Issue21163
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2014-04-05 21:10 by richard.kiss, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Files

File name | Uploaded | Description | Edit
---|---|---|---
asyncio-gc-issue.py | richard.kiss, 2014-04-05 21:10 | |
log_destroyed_pending_task.patch | vstinner, 2014-06-19 14:48 | | review
dont_log_pending.patch | vstinner, 2014-06-25 22:27 | | review
test2.py | mpaolini, 2014-08-18 14:20 | script showing gc collecting unreferenced asyncio tasks |
issue_22163_patch_0.diff | mpaolini, 2014-08-18 19:36 | hold references to all waited futures in task | review
test3.py | mpaolini, 2014-08-19 09:32 | script showing real-life-like example gc collecting unreferenced asyncio tasks |
Messages (37)
msg215633 - (view) | Author: Richard Kiss (richard.kiss) * | Date: 2014-04-05 21:10 | |
Some tasks created via asyncio are vanishing because there is no reference to their resultant futures. This behaviour does not occur in Python 3.3.3 with asyncio-0.4.1. Also, doing a gc.collect() immediately after creating the tasks seems to fix the problem. Attachment also available at https://gist.github.com/richardkiss/9988156 |
|||
msg215638 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2014-04-05 23:05 | |
Ouch. That example is very obfuscated -- I fail to understand what it is trying to accomplish. Running it I see that it always prints 100 for the count with 3.3 or with DO_GC on; for me it prints 87 with 3.4 and DO_GC off. But I wouldn't be surprised if the reason calling gc.collect() "fixes" whatever issue you have is that it causes there not to be any further collections until the next cycle through that main loop, and everything simply runs before being collected. But that's just a theory based on minimal understanding of the example. I'm not sure that tasks are *supposed* to stay alive when there are no references to them. It seems that once they make it past a certain point they keep themselves alive. One more thing: using a try/except I see that the "lost" consumers all get a GeneratorExit on their first iteration. You might want to look into this. (Sorry, gotta run, wanted to dump this.) |
|||
msg215639 - (view) | Author: Richard Kiss (richard.kiss) * | Date: 2014-04-05 23:24 | |
I agree it's confusing and I apologize for that. Background: This multiplexing pattern is used in pycoinnet, a bitcoin client I'm developing at <https://github.com/richardkiss/pycoinnet>. The BitcoinPeerProtocol class multiplexes protocol messages into multiple asyncio.Queue objects so each interested listener can react. An example listener is in pycoinnet.helpers.standards.install_pong_manager, which looks for "ping" messages and sends "pong" responses. When the peer disconnects, the pong manager sees a None (to indicate EOF), and it exits. The return value is uninteresting, so no reference to the Task is kept. My client is in late alpha, and mostly works, but when I tried it on Python 3.4.0, it stopped working and I narrowed it down to this. I'm not certain this behaviour is incorrect, but it's definitely different from 3.3.3, and it seems odd that a GC cycle BEFORE additional references can be made would allow it to work. |
|||
msg215640 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2014-04-05 23:30 | |
Most likely your program is simply relying on undefined behavior and the right way to fix it is to keep strong references to all tasks until they self-destruct. |
|||
msg215642 - (view) | Author: Richard Kiss (richard.kiss) * | Date: 2014-04-05 23:37 | |
I'll investigate further. |
|||
msg215646 - (view) | Author: Richard Kiss (richard.kiss) * | Date: 2014-04-06 02:12 | |
You were right: adding a strong reference to each Task seems to have solved the original problem in pycoinnet. I see that the global collection of tasks in asyncio.tasks is a WeakSet, so it's necessary to keep a strong reference myself. This does seem a little surprising. It can make it trickier to create a task that is only important for its side effects. Compare to threaded programming: unreferenced threads are never collected. For example: f = asyncio.Task(some_coroutine()); f.add_done_callback(some_completion_callback); f = None. In theory, the "some_coroutine" task can be collected, preventing "some_completion_callback" from ever being called. While technically correct, it does seem surprising. (I couldn't get this to work in a simple example, although I did get it to work in a complicated example.) Some change between 3.3 and 3.4 means garbage collection is much more aggressive at collecting unreferenced tasks, which means broken code, like mine, that worked in 3.3 fails in 3.4. This may trip up other early adopters of tulip. Maybe adding a "do_not_collect=True" flag to asyncio.async or asyncio.Task, which would keep a strong reference by putting the task into a singleton set (and removing it via a future callback), would bring attention to this subtle issue. Or displaying a warning in debug mode when a Task is garbage-collected before finishing. Thanks for your help. |
|||
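The strong-reference workaround Richard describes can be sketched with the modern (3.7+) asyncio API: park each fire-and-forget task in a module-level set and discard it when it completes, so the set itself releases the reference. Names such as `background_tasks`, `spawn`, and `side_effect_job` are illustrative, not from this issue:

```python
import asyncio

background_tasks = set()  # strong references keep pending tasks alive

async def side_effect_job(n):
    # stand-in for a coroutine that is run only for its side effects
    await asyncio.sleep(0)
    return n * 2

def spawn(coro):
    """Create a task and hold a strong reference to it until it completes."""
    task = asyncio.ensure_future(coro)
    background_tasks.add(task)                        # hold a strong reference
    task.add_done_callback(background_tasks.discard)  # release it once done
    return task

async def main():
    tasks = [spawn(side_effect_job(i)) for i in range(3)]
    return await asyncio.gather(*tasks)

output = asyncio.run(main())
print(output)  # [0, 2, 4]
```

This is essentially the pattern the asyncio documentation later recommended: the set, not the event loop, owns the pending tasks, and the done callback prevents the set from becoming a leak.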
msg215647 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2014-04-06 03:35 | |
Thanks for understanding. It's definitely subtle: there is also some code in asyncio that logs an error when a Future holding an exception becomes unreachable before anyone has asked for it; this has been a valuable debugging tool, and it depends on *not* holding references to Futures. Regarding the difference between Python 3.3 and 3.4, I don't know the exact reason, but GC definitely gets more precise with each new revision, and there are also some differences in how exactly the debug feature I just mentioned is implemented (look for _tb_logger in asyncio/futures.py). OTOH it does seem a little odd to just GC a coroutine that hasn't exited yet, and I'm not 100% convinced there *isn't* a bug here. The more I think about it, the more I think that it's suspicious that it's always the *first* iteration that gets GC'ed. So I'd like to keep this open as a reminder. Finally, I'm not sure the analogy with threads holds -- a thread manages OS resources that really do have to be destroyed explicitly, but that's not so for tasks, and any cleanup you do need can be handled using try/finally. (In general drawing analogies between threads and asyncio tasks/coroutines is probably one of the less fruitful lines of thinking about the latter; there are more differences than similarities.) |
|||
msg220986 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-06-19 14:48 | |
Ok, I agree that this issue is very tricky :-) The first problem in asyncio-gc-issue.py is that the producer keeps *weak* references to the Queue objects, so the Queue objects are quickly destroyed, especially if gc.collect() is called explicitly. When "yield from queue.get()" is used in a task, the task is paused. The queue creates a Future object and the task registers its _wakeup() method with the Future object. When the queue object is destroyed, the internal future object (used by the get() method) is destroyed too. The last reference to the task was in this future object. As a consequence, the task is also destroyed. While there is a bug in asyncio-gc-issue.py, it's very tricky to understand, and I think that asyncio should help developers detect such bugs. I propose the attached patch, which emits a warning if a task is destroyed while it is not done (its status is still PENDING). I wrote a unit test which is much simpler than asyncio-gc-issue.py. Read the test to understand the issue; I added many comments to explain the state. -- My patch was written for Python 3.4+: it adds a destructor to the Task class, and we cannot add a destructor to Future objects because these objects are likely to be part of reference cycles. See the following issue which proposes a fix: https://code.google.com/p/tulip/issues/detail?id=155 Using this fix for reference cycles, it may be possible to also emit the log in Tulip (Python 3.3). |
|||
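The first link in the failure chain Victor describes -- a producer that holds only weak references, so its queues silently disappear -- can be reduced to a few lines. This sketch (variable names are mine, not from the attached script) shows the queue vanishing from a producer-side `WeakSet` the moment the last strong reference is dropped; a consumer suspended in `q.get()` would then lose its wake-up future, and with it the last strong reference to itself:

```python
import asyncio
import weakref

async def main():
    queues = weakref.WeakSet()   # producer-side container holds only weak refs
    q = asyncio.Queue()
    queues.add(q)
    assert len(queues) == 1      # the queue is visible while strongly referenced
    del q                        # drop the last strong reference
    # CPython reclaims the queue immediately; nothing can put items into it
    # anymore, and any task waiting on its internal future dies with it.
    return len(queues)

print(asyncio.run(main()))  # 0
```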
msg221010 - (view) | Author: Richard Kiss (richard.kiss) * | Date: 2014-06-19 18:08 | |
The more I use asyncio, the more I am convinced that the correct fix is to keep a strong reference to a pending task (perhaps in a set in the eventloop) until it starts. Without realizing it, I implicitly made this assumption when I began working on my asyncio project (a bitcoin node) in Python 3.3. I think it may be a common assumption for users. Ask around. I can say that it made the transition to Python 3.4 very puzzling. In several cases, I've needed to create a task where the side effects are important but the result is not. Sometimes this task is created in another task which may complete before its child task begins, which means there is no natural place to store a reference to this task. (Goofy workaround: wait for child to finish.) |
|||
msg221096 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-06-20 15:16 | |
> The more I use asyncio, the more I am convinced that the correct fix is to keep a strong reference to a pending task (perhaps in a set in the eventloop) until it starts.

The problem is not the task; read my message again. The problem is that nobody holds a strong reference to the Queue, whereas the producer is supposed to fill this queue and the task is waiting on it. I cannot suggest how to fix your example; it depends on what you want to do.

> Without realizing it, I implicitly made this assumption when I began working on my asyncio project (a bitcoin node) in Python 3.3. I think it may be a common assumption for users. Ask around. I can say that it made the transition to Python 3.4 very puzzling.

Sorry, I don't understand the relation between this issue and the Python version (3.3 vs 3.4). Do you mean that Python 3.4 behaves differently? The garbage collection of Python 3.4 has been *improved*: Python 3.4 is able to break more reference cycles. If your program doesn't run anymore on Python 3.4, it may mean that you rely on a reference cycle, which sounds very silly.

> In several cases, I've needed to create a task where the side effects are important but the result is not. Sometimes this task is created in another task which may complete before its child task begins, which means there is no natural place to store a reference to this task. (Goofy workaround: wait for child to finish.)

I'm not sure that this is the same issue. If you think so, could you please write a short example showing the problem? |
|||
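Victor's fix -- hold the queues, and the consumers waiting on them stay reachable -- might look like this hypothetical producer/consumer sketch (not the code from the attachment; the `None` EOF sentinel mirrors the pong-manager convention Richard described):

```python
import asyncio

async def consumer(q):
    while True:
        item = await q.get()
        if item is None:        # EOF sentinel: peer disconnected
            return

async def main():
    # Strong references: the queues (and thus the tasks waiting on their
    # internal futures) stay alive for the whole session.
    queues = [asyncio.Queue() for _ in range(3)]
    consumers = [asyncio.ensure_future(consumer(q)) for q in queues]
    for q in queues:
        await q.put(None)       # signal EOF to every consumer
    await asyncio.gather(*consumers)
    return sum(c.done() for c in consumers)

print(asyncio.run(main()))  # 3
```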
msg221375 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-06-23 21:00 | |
@Guido, @Yury: What do you think of log_destroyed_pending_task.patch? Does it sound correct? Or would you prefer to automatically keep a strong reference somewhere and then break it when the task is done? Such an approach sounds error-prone :) |
|||
msg221387 - (view) | Author: Yury Selivanov (yselivanov) * | Date: 2014-06-24 00:37 | |
> @Guido, @Yury: What do you think of log_destroyed_pending_task.patch? Does it sound correct?

Premature task garbage collection is indeed hard to debug. But at least, with your patch, one gets an exception and has a chance to track the bug down. So I'm +1 on the patch. As for keeping strong references to tasks: it may have its own downsides, such as hard-to-debug memory leaks. I'd rather have my program crash and/or have your patch report the problem than search for obscure code that eats all server memory once a week. I think we need to collect more evidence that the problem is common & annoying before making any decisions on this topic, as that's something that will be hard to revert. Hence I'm -1 on strong references. |
|||
msg221470 - (view) | Author: Guido van Rossum (Guido.van.Rossum) | Date: 2014-06-24 15:48 | |
Patch looks good. Go ahead. |
|||
msg221495 - (view) | Author: Richard Kiss (Richard.Kiss) | Date: 2014-06-24 20:29 | |
I reread more carefully, and I am in agreement now that I better understand what's going on. Thanks for your patience. |
|||
msg221503 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-06-24 22:01 | |
I committed my change in Tulip (78dc74d4e8e6), Python 3.4 and 3.5:

changeset: 91359:978525270264
branch: 3.4
parent: 91357:a941bb617c2a
user: Victor Stinner <victor.stinner@gmail.com>
date: Tue Jun 24 22:37:53 2014 +0200
files: Lib/asyncio/futures.py Lib/asyncio/tasks.py Lib/test/test_asyncio/test_base_events.py Lib/test/test_asyncio/test_tasks.py
description: asyncio: Log an error if a Task is destroyed while it is still pending

changeset: 91360:e1d81c32f13d
parent: 91358:3fa0d2b297c6
parent: 91359:978525270264
user: Victor Stinner <victor.stinner@gmail.com>
date: Tue Jun 24 22:38:31 2014 +0200
files: Lib/test/test_asyncio/test_tasks.py
description: (Merge 3.4) asyncio: Log an error if a Task is destroyed while it is still pending |
|||
msg221508 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-06-24 22:42 | |
The new check emits a lot of "Task was destroyed but it is pending!" messages when running test_asyncio. I keep the issue open to remind me that I have to fix them. |
|||
msg221573 - (view) | Author: Roundup Robot (python-dev) | Date: 2014-06-25 21:14 | |
New changeset 1088023d971c by Victor Stinner in branch '3.4':
Issue #21163, asyncio: Fix some "Task was destroyed but it is pending!" logs in tests
http://hg.python.org/cpython/rev/1088023d971c

New changeset 7877aab90c61 by Victor Stinner in branch 'default':
(Merge 3.4) Issue #21163, asyncio: Fix some "Task was destroyed but it is
http://hg.python.org/cpython/rev/7877aab90c61 |
|||
msg221574 - (view) | Author: Roundup Robot (python-dev) | Date: 2014-06-25 21:34 | |
New changeset e9150fdf068a by Victor Stinner in branch '3.4':
asyncio: sync with Tulip
http://hg.python.org/cpython/rev/e9150fdf068a

New changeset d92dc4462d26 by Victor Stinner in branch 'default':
(Merge 3.4) asyncio: sync with Tulip
http://hg.python.org/cpython/rev/d92dc4462d26 |
|||
msg221578 - (view) | Author: Roundup Robot (python-dev) | Date: 2014-06-25 22:00 | |
New changeset 4e4c6e2ed0c5 by Victor Stinner in branch '3.4':
Issue #21163: Fix one more "Task was destroyed but it is pending!" log in tests
http://hg.python.org/cpython/rev/4e4c6e2ed0c5

New changeset 24282c6f6019 by Victor Stinner in branch 'default':
(Merge 3.4) Issue #21163: Fix one more "Task was destroyed but it is pending!"
http://hg.python.org/cpython/rev/24282c6f6019 |
|||
msg221580 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-06-25 22:27 | |
I fixed the first "Task was destroyed but it is pending!" messages where the fix was simple. The attached dont_log_pending.patch fixes the remaining messages when running test_asyncio. I'm not sure yet that this patch is the best approach to fix the issue. Modified functions, with an example of a related test for each:

* BaseEventLoop.run_until_complete(): don't log, because the method already raises an exception if the future didn't complete ("Event loop stopped before Future completed.") => related test: test_run_until_complete_stopped() of test_events.py
* wait(): don't log, because the caller doesn't have control over the internal sub-tasks, and the task executing wait() will already emit a message if it is destroyed while it hasn't completed => related test: test_wait_errors() of test_tasks.py
* gather(): same rationale as wait() => related test: test_one_exception() of test_tasks.py
* test_utils.run_briefly(): the caller doesn't have access to the task, and the function is best-effort; it doesn't guarantee that running one step of the event loop is enough to execute all pending callbacks => related test: test_baseexception_during_cancel() of test_tasks.py |
|||
msg221956 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-06-30 12:42 | |
Hum, dont_log_pending.patch is not correct for wait(): wait() returns (done, pending), where pending is a set of pending tasks. So it's still possible that pending tasks are destroyed while they are still pending, after the end of wait(). The log should not be silenced here. |
|||
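The point here is that `wait()` hands the still-pending tasks back to the caller in its `(done, pending)` result, so responsibility for them -- and for any destroy-pending log they might trigger later -- lies with the caller, not with wait() itself. A small illustration with made-up coroutine names:

```python
import asyncio

async def quick():
    return "done"

async def slow():
    await asyncio.sleep(60)     # still pending when wait() times out

async def main():
    t1 = asyncio.ensure_future(quick())
    t2 = asyncio.ensure_future(slow())
    done, pending = await asyncio.wait({t1, t2}, timeout=0.1)
    for task in pending:
        task.cancel()           # the caller owns the leftover tasks
    return len(done), len(pending)

print(asyncio.run(main()))  # (1, 1)
```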
msg221957 - (view) | Author: Roundup Robot (python-dev) | Date: 2014-06-30 12:53 | |
New changeset 13e78b9cf290 by Victor Stinner in branch '3.4':
Issue #21163: BaseEventLoop.run_until_complete() and test_utils.run_briefly()
http://hg.python.org/cpython/rev/13e78b9cf290

New changeset 2d0fa8f383c8 by Victor Stinner in branch 'default':
(Merge 3.4) Issue #21163: BaseEventLoop.run_until_complete() and
http://hg.python.org/cpython/rev/2d0fa8f383c8 |
|||
msg222699 - (view) | Author: Roundup Robot (python-dev) | Date: 2014-07-10 20:36 | |
New changeset f13cde63ca73 by Victor Stinner in branch '3.4':
asyncio: sync with Tulip
http://hg.python.org/cpython/rev/f13cde63ca73

New changeset a67adfaf670b by Victor Stinner in branch 'default':
(Merge 3.4) asyncio: sync with Tulip
http://hg.python.org/cpython/rev/a67adfaf670b |
|||
msg223230 - (view) | Author: Roundup Robot (python-dev) | Date: 2014-07-16 16:38 | |
New changeset 6d5a76214166 by Victor Stinner in branch '3.4':
Issue #21163, asyncio: Ignore "destroy pending task" warnings for private tasks
http://hg.python.org/cpython/rev/6d5a76214166

New changeset fbd3e9f635b6 by Victor Stinner in branch 'default':
(Merge 3.4) Issue #21163, asyncio: Ignore "destroy pending task" warnings for
http://hg.python.org/cpython/rev/fbd3e9f635b6 |
|||
msg223232 - (view) | Author: Roundup Robot (python-dev) | Date: 2014-07-16 16:52 | |
New changeset e4fe6706b7b4 by Victor Stinner in branch '3.4':
Issue #21163: Fix "destroy pending task" warning in test_wait_errors()
http://hg.python.org/cpython/rev/e4fe6706b7b4

New changeset a627b23f57d4 by Victor Stinner in branch 'default':
(Merge 3.4) Issue #21163: Fix "destroy pending task" warning in test_wait_errors()
http://hg.python.org/cpython/rev/a627b23f57d4 |
|||
msg223233 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-07-16 16:57 | |
Ok, I fixed the last warnings emitted by the unit tests run in debug mode. I close the issue. |
|||
msg225493 - (view) | Author: Marco Paolini (mpaolini) * | Date: 2014-08-18 14:20 | |
I finally wrapped my head around this. I wrote a (simpler) script to get a better picture.

What happens
-------------

When a consumer task is first instantiated, the loop holds a strong reference to it (_ready). Later on, as the loop starts, the consumer task is yielded and it waits on an unreachable future. The last strong ref to it is lost (loop._ready). It is not collected immediately because it just created a reference loop (task -> coroutine -> stack -> future -> task) that will be broken only at task completion.

gc.collect() called *before* the tasks are ever run has the weird side effect of moving the automatic gc collection forward in time: automatic gc triggers after a few (but not all) consumers have become unreachable, depending on how many instructions were executed before running the loop. gc.collect() called after all the consumers are waiting on the unreachable future reaps all consumer tasks as expected. No bug in garbage collection.

Yielding from asyncio.sleep() prevents the consumers from being collected: it creates a strong ref to the future in the loop. I suspect all network-related asyncio coroutines behave this way too.

Summing up: tasks that have no strong refs may be garbage collected unexpectedly or not at all, depending on which future they yield to. It is very difficult to debug and understand why these tasks disappear.

Side note: the patches submitted and merged in this issue do emit the relevant warnings when PYTHONASYNCIODEBUG is set. This is very useful.

Proposed enhancements
----------------------

1. Document that you should always keep strong refs to tasks or to the futures/coroutines the tasks yield from. This knowledge is currently passed around among the brave asyncio users like oral tradition.

2. Alternatively, keep strong references to all futures that make it through Task._step. We are already keeping strong refs to *some* of the asyncio builtin coroutines (`asyncio.sleep` is one of those). Also, we do keep strong references to tasks that are ready to be run (the ones that simply `yield` or the ones that have not started yet).

If you also think 1. or 2. are needed, let me know and I'll try to cook a patch. Sorry for the noise |
|||
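Marco's scenario -- a suspended task whose only remaining references form the task -> coroutine -> future cycle -- can be reproduced directly. This sketch is an assumption-laden reconstruction, not the attached test2.py: it uses `asyncio.Event` (my choice) as the wake-up source reachable only through the task itself, and on CPython the explicit `gc.collect()` both reclaims the pending task and triggers the "Task was destroyed but it is pending!" error log that this issue added:

```python
import asyncio
import gc
import weakref

async def consumer(event):
    await event.wait()      # suspends on a Future created inside Event.wait()

async def main():
    event = asyncio.Event()
    task = asyncio.ensure_future(consumer(event))
    await asyncio.sleep(0)  # let the task run one step and suspend
    probe = weakref.ref(task)
    del task, event         # remaining refs now form an unreachable cycle
    gc.collect()            # reclaims the pending task (and logs an error)
    return probe() is None  # True: the pending task was garbage collected

print(asyncio.run(main()))
```

The `weakref.ref` probe is how the collection is observed; asyncio's own registry (`asyncio.all_tasks()`) is itself a weak collection, which is exactly why nothing else keeps the task alive.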
msg225497 - (view) | Author: Guido van Rossum (Guido.van.Rossum) | Date: 2014-08-18 16:04 | |
I'm all in favor of documenting that you must keep a strong reference to a task that you want to keep alive. I'm not keen on automatically keeping all tasks alive; that might exacerbate leaks (which are by definition hard to find) in existing programs. |
|||
msg225498 - (view) | Author: Marco Paolini (mpaolini) * | Date: 2014-08-18 16:17 | |
Asking the user to manage strong refs is just passing the potential leak issue outside of the standard library. It doesn't really solve anything. If the user gets the strong refs wrong he can either lose tasks or leak memory. If the standard library gets it right, both issues are avoided. > I'm all in favor of documenting that you must keep a strong reference to a > task that you want to keep alive. I'm not keen on automatically keep all > tasks alive, that might exacerbate leaks (which are by definition hard to find) in existing programs. |
|||
msg225499 - (view) | Author: Guido van Rossum (Guido.van.Rossum) | Date: 2014-08-18 16:22 | |
So you are changing your mind and withdrawing your option #1. I don't have the time to really dig deeply into the example app and what's going on. If you want to help, you can try to come up with a patch (and it should have good unit tests). I'll be on vacation most of this week. |
|||
msg225500 - (view) | Author: Marco Paolini (mpaolini) * | Date: 2014-08-18 16:25 | |
> So you are changing your mind and withdrawing your option #1. I think option #1 (tell users to keep strong refs to tasks) is OK but option #2 is better. Yes, I changed my mind ;) |
|||
msg225504 - (view) | Author: Marco Paolini (mpaolini) * | Date: 2014-08-18 19:36 | |
Submitted a first stab at #2. Let me know what you think. If this works, we'll have to remove the test_gc_pending test, and then maybe even the code that now logs errors when a pending task is gc'ed. |
|||
msg225520 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-08-19 08:02 | |
I don't understand how keeping a strong reference would fix anything. You only provided one example (asyncio-gc-issue.py) which uses Queue objects but keeps weak references to them. Keeping strong references to tasks is not the right fix. You must keep strong references to queues. If a queue is destroyed, how can you put an item into it? Otherwise, the task will wait forever. Keeping a strong reference to the task just hides the bug. Or I missed something. I dislike the idea of keeping strong references to tasks; it may create even more reference cycles. We already have too many cycles with exceptions stored in futures (in tasks). The current unit test uses low-level functions to remove a variable using a frame object. Can you provide an example which shows the bug without using low-level functions? |
|||
msg225522 - (view) | Author: Marco Paolini (mpaolini) * | Date: 2014-08-19 09:32 | |
> I don't understand how keeping a strong reference would fix anything. You
> only provided one example (asyncio-gc-issue.py) which uses Queue objects but
> keeps weak references to them. Keeping strong references to tasks is not the
> right fix. You must keep strong references to queues. If a queue is
> destroyed, how can you put an item into it? Otherwise, the task will wait
> forever. Keeping a strong reference to the task just hides the bug. Or I
> missed something.

The original asyncio-gc-issue.py wasn't written by me, and yes, as you say, it does have the reference bug you describe. I argue that bug shouldn't cause tasks to die: it should rather limit (as gc proceeds) the number of queues available to the producer in the WeakSet(), leaving alive all consumers waiting on an unreachable queue. Please look at my test2.py, or even better test3.py, for a simpler example. Note that in my issue_22163_patch_0.diff I only keep strong refs to the futures a task is waiting on, just as asyncio is already doing with the asyncio.sleep() coroutine.

> I dislike the idea of keeping strong references to tasks, it may create
> even more reference cycles. We already have too many cycles with exceptions
> stored in futures (in tasks).

We are also already keeping strong refs to futures like asyncio.sleep. I dislike the idea of randomly losing tasks. I also dislike the idea of forcing the user to manage strong refs to their tasks: all 3rd-party libraries will have to invent their own way, and it will lead to even more leaks/cycles that are very hard to debug. And it's not just exceptions: every time a task is yielding on a future, asyncio creates a reference cycle.

> The current unit test uses low level functions to remove a variable using a
> frame object. Can you provide an example which shows the bug without using
> low level functions?

My issue_22163_patch_0.diff only clears references by setting variables to `None`; no low-level stuff needed. My test2.py example script also doesn't use any low-level stuff. I just uploaded test3.py with a simpler (and possibly more realistic) example. |
|||
msg227173 - (view) | Author: Marco Paolini (mpaolini) * | Date: 2014-09-20 20:31 | |
Sorry for keeping this alive. Take a look at the `wait_for.py` just submitted in the unrelated #22448: no strong refs to the tasks are kept. The tasks remain alive only because they are timers and the event loop keeps strong refs to those. Do you think my proposed patch is OK? Should I open a new issue? |
|||
msg227176 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2014-09-20 21:14 | |
I'm not sure how that wait_for.py example from issue2116 relates to this issue -- it seems to demonstrate the opposite problem (tasks are kept alive even though they are cancelled). Then again I admit I haven't looked deeply into the example (though I am sympathetic with the issue it purports to demonstrate). |
|||
msg227177 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2014-09-20 21:15 | |
(Whoops meant to link to issue22448.) |
History

Date | User | Action | Args
---|---|---|---
2022-04-11 14:58:01 | admin | set | github: 65362 |
2014-09-20 21:15:28 | gvanrossum | set | messages: + msg227177 |
2014-09-20 21:14:38 | gvanrossum | set | messages: + msg227176 |
2014-09-20 20:31:46 | mpaolini | set | messages: + msg227173 |
2014-08-19 09:32:10 | mpaolini | set | files: + test3.py messages: + msg225522 |
2014-08-19 08:02:38 | vstinner | set | messages: + msg225520 |
2014-08-18 19:36:36 | mpaolini | set | files: + issue_22163_patch_0.diff messages: + msg225504 |
2014-08-18 16:25:07 | mpaolini | set | messages: + msg225500 |
2014-08-18 16:22:12 | Guido.van.Rossum | set | messages: + msg225499 |
2014-08-18 16:17:50 | mpaolini | set | messages: + msg225498 |
2014-08-18 16:04:12 | Guido.van.Rossum | set | messages: + msg225497 |
2014-08-18 14:20:58 | mpaolini | set | files: + test2.py nosy: + mpaolini messages: + msg225493 |
2014-07-16 16:57:08 | vstinner | set | status: open -> closed resolution: fixed messages: + msg223233 |
2014-07-16 16:52:10 | python-dev | set | messages: + msg223232 |
2014-07-16 16:38:40 | python-dev | set | messages: + msg223230 |
2014-07-10 20:36:52 | python-dev | set | messages: + msg222699 |
2014-06-30 12:53:52 | python-dev | set | messages: + msg221957 |
2014-06-30 12:42:37 | vstinner | set | messages: + msg221956 |
2014-06-25 22:27:11 | vstinner | set | files: + dont_log_pending.patch messages: + msg221580 |
2014-06-25 22:00:40 | python-dev | set | messages: + msg221578 |
2014-06-25 21:34:48 | python-dev | set | messages: + msg221574 |
2014-06-25 21:14:10 | python-dev | set | nosy: + python-dev messages: + msg221573 |
2014-06-24 22:42:25 | vstinner | set | messages: + msg221508 |
2014-06-24 22:01:49 | vstinner | set | messages: + msg221503 |
2014-06-24 20:29:25 | Richard.Kiss | set | nosy: + Richard.Kiss messages: + msg221495 |
2014-06-24 15:48:36 | Guido.van.Rossum | set | nosy: + Guido.van.Rossum messages: + msg221470 |
2014-06-24 00:37:39 | yselivanov | set | messages: + msg221387 |
2014-06-23 21:00:18 | vstinner | set | messages: + msg221375 |
2014-06-20 15:16:22 | vstinner | set | messages: + msg221096 |
2014-06-19 18:08:07 | richard.kiss | set | messages: + msg221010 |
2014-06-19 14:49:10 | vstinner | set | title: asyncio task possibly incorrectly garbage collected -> asyncio doesn't warn if a task is destroyed during its execution |
2014-06-19 14:48:38 | vstinner | set | files: + log_destroyed_pending_task.patch resolution: remind -> (no value) messages: + msg220986 keywords: + patch |
2014-06-06 11:41:58 | vstinner | set | components: + asyncio |
2014-04-06 03:35:09 | gvanrossum | set | resolution: not a bug -> remind messages: + msg215647 |
2014-04-06 02:12:43 | richard.kiss | set | resolution: not a bug messages: + msg215646 |
2014-04-05 23:37:36 | richard.kiss | set | messages: + msg215642 |
2014-04-05 23:30:44 | gvanrossum | set | messages: + msg215640 |
2014-04-05 23:24:07 | richard.kiss | set | messages: + msg215639 |
2014-04-05 23:05:26 | gvanrossum | set | messages: + msg215638 |
2014-04-05 21:57:25 | brett.cannon | set | nosy: + gvanrossum, pitrou, vstinner, giampaolo.rodola, yselivanov |
2014-04-05 21:42:29 | richard.kiss | set | title: asyncio Task Possibly Incorrectly Garbage Collected -> asyncio task possibly incorrectly garbage collected |
2014-04-05 21:10:51 | richard.kiss | set | hgrepos: - hgrepo231 |
2014-04-05 21:10:31 | richard.kiss | create |