classification
Title: release arguments of contextmanager
Type: resource usage Stage: resolved
Components: Library (Lib) Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: Martin.Teichmann, ncoghlan, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-05-08 14:04 by Martin.Teichmann, last changed 2018-01-28 05:15 by ncoghlan. This issue is now closed.

Files
File name Uploaded Description Edit
refactor-contextmanager.diff serhiy.storchaka, 2017-05-09 09:12
Pull Requests
URL Status Linked Edit
PR 1500 merged Martin.Teichmann, 2017-05-08 14:18
PR 5374 merged ncoghlan, 2018-01-28 04:20
Messages (12)
msg293234 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2017-05-08 14:04
The arguments of a function which was decorated to be a context manager are stored inside the context manager, and are thus kept alive.

This is a memory leak.

Example:

    @contextmanager
    def f(a):
        do_something_with(a)
        a = None  # should release the memory
        yield

if this is now called with something big, say

    with f(something_really_huge):
        pass

then this something_really_huge is kept alive during the with statement, even though the function explicitly let go of it.
msg293274 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-05-09 03:22
Nice catch. This behaviour is an artifact of the ContextDecorator support in _GeneratorContextManager, where context managers created with `@contextmanager` can be used as function decorators (implicitly wrapping them in a with statement) as well as directly in with statements: to handle the ContextDecorator case, the original arguments are stored on the _GeneratorContextManager instance, which has the side effect of keeping those arguments referenced for the whole of the with statement body (due to the reference to the instance from the bound __exit__ method).

Martin's initial patch just unconditionally deletes those attributes in __enter__, which is technically correct, but *looks* wrong when reading the code (understanding why it's correct requires understanding the private _recreate_cm API used for collaboration between ContextDecorator and _GeneratorContextManager, and the fact that that relationship is a bit weird and convoluted is the main reason it's private in the first place).

My recommendation on the PR for a more self-obviously correct fix is to do the following:

- add a new "allow_recreation=True" flag parameter to `_GeneratorContextManager.__init__` that controls whether or not the CM recreation attributes get set or not
- pass `False` for that argument in `_recreate_cm` so they get set to None instead
- update `__enter__` to set them to None if they're not already None

The practical effect is the same as Martin's original patch, but the more explicit version should make it easier for readers to see what is going on (i.e. even when recreation is allowed during construction, the intent is that for any given instance, you will only ever call either __enter__() *or* _recreate_cm(), and once you do call __enter__(), that's it - the only permitted call after that point on that particular instance is __exit__()).
msg293294 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-09 09:12
Wouldn't be better to split _GeneratorContextManager on two classes rather than add a boolean argument? Currently _GeneratorContextManager used  two different functions -- an one-shot context manager and a decorator that recreates context managers.

Proposed patch makes _GeneratorContextManager an one-shot context manager again (it no longer keep references to arguments), and adds _GeneratorContextManagerDecorator which can be used either as an one-shot context manager (references to arguments are removed after use) or as a decorator (in the last case the gen attribute is not created).
msg293299 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-05-09 11:09
You're right, there's no actual requirement that _recreate_cm() call self.__class__, so it can use a different type internally.
msg293382 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2017-05-10 06:29
I personally prefer the current situation. The problem is the following: when an @contextmanager is first called, we don't know yet whether the user wants to use it directly, or as a decorator, so we have to have some kind of hybrid class. Once it's used as a decorator, we need to recreate a context manager every time the decorate function is called. Then we need exactly the direct context manager mentioned above. This is why we recreate it with __class__.

In my MR on github I simplify that by not recreating the entire context manager, but only the generator function. I think this is the most clear way, as it shows what's going on in a confined area of code.
msg293393 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-05-10 09:10
Noting something I picked up in the PR review that apparently isn't covered by the test suite: the whole _recreate_cm() dance is necessary to make recursive and concurrent invocation of decorated functions work correctly. Otherwise the original CM instance ends up being shared between different active calls and problems start to arise.

It's these kinds of subtleties that prompted me to declare trying to make otherwise one-shot CMs usable with ContextDecorator an inherently flawed design and hence opt out of making it a public API: https://bugs.python.org/issue11647#msg161113

However, we do still need to keep it working reliably for the @contextmanager case - it's just going to be inherently messy at the implementation level because the original design decision to support the hybrid usage model turned out to be a questionable one.
msg293847 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2017-05-17 07:14
Hi, so where do we go from here? I could write the proposed allow_recreation flag, but IMHO this adds only dead code, as it is of no need at all. I don't think code gets more clear by adding dead code...

I opted for one line more of comment, I think that helps much more. Plus the additional tests I wrote in the PR I think it would be good in the Python core.
msg293848 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-05-17 10:47
Re-reading my previous comment, I now realise it wasn't particularly clear.

To be more explicit, I agree with Serhiy that since the internals of how @contextmanager works are a private API, it makes sense to switch to a two-class design that doesn't rely on the _recreate_cm() mechanism at all, and instead relies on overriding `__call__()` for the decorator use case.

At runtime, the end result is pretty similar to your current implementation, but the split into two classes more clearly separates the base "use a generator as a context manager" functionality from the "implicitly wrap a generator around a functional call" feature.

It makes sense to do that, since the current self.__class__ based design stems from the time when I was considering making this a public API (and hence needed to account for inheritance), and I didn't go back and review it for possible refactoring opportunities when I decided not to do that.
msg310912 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-01-28 04:03
Ouch, this clearly slipped off my review radar last year - I just picked it up again now while going through all my currently assigned issues before 3.7b1.

While I still think the suggested refactoring above would be a good way to go, Martin's single-line fix in the PR passes all the new test cases, and solves the reported problem, so I'm going to add a NEWS entry and then merge it.
msg310913 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-01-28 04:17
New changeset dd0e087edc8f1e4d2c0913236b1a62a77d9db6d8 by Nick Coghlan (Martin Teichmann) in branch 'master':
bpo-30306: release arguments of contextmanager (GH-1500)
https://github.com/python/cpython/commit/dd0e087edc8f1e4d2c0913236b1a62a77d9db6d8
msg310921 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-01-28 05:14
New changeset a278ad2faa76ae61569a58f36e06ba8745f4a9b7 by Nick Coghlan in branch 'master':
bpo-30306: Add missing NEWS entry (GH-5374)
https://github.com/python/cpython/commit/a278ad2faa76ae61569a58f36e06ba8745f4a9b7
msg310922 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-01-28 05:15
Thanks for the issue report and patch Martin, and sorry for the long delay in getting it merged!
History
Date User Action Args
2018-01-28 05:15:17ncoghlansetstatus: open -> closed
resolution: fixed
messages: + msg310922

stage: patch review -> resolved
2018-01-28 05:14:32ncoghlansetmessages: + msg310921
2018-01-28 04:20:55ncoghlansetstage: patch review
pull_requests: + pull_request5215
2018-01-28 04:17:49ncoghlansetmessages: + msg310913
2018-01-28 04:03:32ncoghlansetmessages: + msg310912
2017-05-17 10:47:43ncoghlansetmessages: + msg293848
2017-05-17 07:14:18Martin.Teichmannsetmessages: + msg293847
2017-05-10 09:10:11ncoghlansetmessages: + msg293393
2017-05-10 06:29:38Martin.Teichmannsetmessages: + msg293382
2017-05-09 11:09:39ncoghlansetmessages: + msg293299
2017-05-09 09:12:24serhiy.storchakasetfiles: + refactor-contextmanager.diff

nosy: + serhiy.storchaka
messages: + msg293294

keywords: + patch
2017-05-09 03:22:43ncoghlansetmessages: + msg293274
2017-05-08 14:18:51Martin.Teichmannsetpull_requests: + pull_request1602
2017-05-08 14:18:20rhettingersetassignee: ncoghlan

nosy: + ncoghlan
2017-05-08 14:04:41Martin.Teichmanncreate