Add context manager to temporarily disable GC #75537
Comments
Used like this:

    with gc_disabled():
        run_some_timing()

    with gc_disabled():
        # do_something_that_has_real_time_guarantees
        # such as a pair trade, robotic braking, etc.

This would be implemented in C for atomicity and speed. It would be roughly equivalent to:

    @contextmanager
    def gc_disabled():
        if gc.isenabled():
            gc.disable()
            yield
            gc.enable()
        else:
            yield
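For illustration, the pure-Python equivalent above can be exercised directly to check the intended enable/disable behavior (this is just the sketch from the comment, not the proposed C implementation):

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_disabled():
    # Pure-Python equivalent from the proposal above: toggle the GC
    # only if it was enabled on entry, and restore it on exit.
    if gc.isenabled():
        gc.disable()
        yield
        gc.enable()
    else:
        yield

gc.enable()
with gc_disabled():
    assert not gc.isenabled()     # GC is off inside the block
    with gc_disabled():           # nested block takes the no-op branch
        assert not gc.isenabled()
assert gc.isenabled()             # restored on exit
```

Note that without a try/finally, an exception inside the block would leave the GC disabled; that robustness gap comes up later in the thread.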
+1 from me for the general idea. As far as where to put it goes, I think the gc module itself makes the most sense.
Note that threads can break this. Even without calling gc.enable() explicitly, a "with gc_disabled()" in a different thread can re-enable GC inside another "with gc_disabled()" block.
Yes, this will be in the same category as the standard stream redirection context managers: multi-threaded applications either won't be able to use it, or else will need to manage access to it somehow.
I believe Raymond is aware of the thread issue. We discussed it. If gc.disable() returned the previous state of the gc instead of None, and there were an API to enable based on a passed-in bool (both written in C and executed with the GIL held, or merely another lock protecting changes to the gc state rather than the GIL), then this Python would work with multiple threads all fighting to control the gc state:

    @contextmanager
    def gc_disabled():
        previous_enable_state = gc.disable()
        yield
        gc.set_enable(previous_enable_state)

But we don't need to modify gc.disable's return value or add a set_enable() if gc_disabled() itself is not implemented in Python. That's up to the implementer.
I'd be -1 on this without a demonstrated broad need for it in at least some context outside of microbenchmarking utilities (which presumably have already implemented similar stuff). If a minimum bar for applicability isn't applied, we'll end up with dozens of these special-purpose managers, each justified by the last limited-applicability one. Stuff like contextlib.closing is justifiable (though I wish it allowed you to replace the method name called, so it could call terminate, kill, release, what-have-you) since cleanup via close is so common, but disabling and re-enabling gc is usually for timing purposes, and there aren't *that* many tools that need it.

It doesn't seem all that useful for real-time purposes either; sure, it disables gc, but it doesn't disable other forms of implicit non-local code execution that are surprisingly hard to predict (e.g. object destruction, including __del__, for non-cyclic cases is going to depend on whether the stuff being decref-ed is still owned outside the block). Disabling GC means a lot in Java, because *all* cleanup is GC; in Python, it's just cycle collection, so you're not giving much in the way of guarantees, as the code in question has to be written with a *lot* of assumptions that Python usually can't support (it's hardly a real-time language).

Seems like if you want a speed-up and reliability for this case (and all other context managers intended to be low overhead), it would be better to move parts of contextlib to the C layer (e.g. contextmanager and the classes it's implemented in terms of), so that simple and correct throwaway implementations for arbitrary custom use cases are fast enough. Atomicity clearly doesn't actually matter here (it can't be made atomic in any meaningful sense given the lack of thread safety), so any real problem with the provided implementations (assuming they had try/finally added to make them robust) is overhead, not atomicity; the code posted so far would be fine (after adding try/finally) without the speed issues.
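The try/finally hardening mentioned here is a small change to the pure-Python sketch from the first comment (still not thread-safe, only exception-safe):

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_disabled():
    was_enabled = gc.isenabled()
    if was_enabled:
        gc.disable()
    try:
        yield
    finally:
        # Restore the GC even if the body raises.
        if was_enabled:
            gc.enable()

gc.enable()
try:
    with gc_disabled():
        assert not gc.isenabled()
        raise ValueError("boom")
except ValueError:
    pass
assert gc.isenabled()  # GC restored despite the exception
```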
This is being proposed for the gc module, not contextlib, so I'm not sure how the previous comment applies. This is just a convenience API for folks already doing this kind of manipulation themselves.

Assigning this to Lisa for a C implementation, docs, and tests.
I have prepared a PR in GitHub with an initial implementation of the context manager trying to fulfil the discussed requirements: #3980 |
I just realized that the link I provided is incorrect. This is the correct one (it is also the one that appears in this issue anyway):
What is the name of the context manager? gc_disabled, disabled, or Disabled? I see different names in the PR and this confuses me.

Sorry about that. The context manager is "gc.Disabled()", which I admit is probably a bad name. The documentation shows an example of the "equivalent" Python code as stated by Raymond in the first comment, but I see now that this will cause confusion. Please, can you indicate what would be a correct name for the context manager so I can update the PR?

Just to clarify the current situation: at this point, the context manager is referred to as "disabled" in the C code but is exported as "Disabled" to the garbage collection module. The "gc_disabled" name is the Python "equivalent" given by Raymond that is included in the documentation.
Given the existing "gc.enable", "gc.disable" and "gc.isenabled" APIs, I'd suggest calling this one "gc.ensure_disabled": it ensures the GC is disabled in the with statement body, but only turns it back on at the end if it was previously enabled. That same name would then be used all the way through the code and documentation.
A bug found by Coverity (PyErr_WarnEx might error out; please update the code to handle that):

    *** CID 1428756: Error handling issues (CHECKED_RETURN)
Lisa, would you like to take a crack at fixing this one?
I gave it a shot; it looks like we need to ensure that we catch any errors that could be thrown by the warning itself. I wasn't entirely sure how to write a test for this; if anyone knows how, I'll definitely add it!
@lisroach A possible test for this is to repeat the test with warnings.filterwarnings('error') and then check for an error like:

    ======================================================================
    ERROR: test_ensure_disabled_thread (Lib.test.test_gc.GCTogglingTests)
    ----------------------------------------------------------------------
    RuntimeWarning: Garbage collector enabled while another thread is inside gc.ensure_enabled

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/home/pgalindo3/cpython/Lib/test/support/__init__.py", line 2083, in decorator
        return func(*args)
      File "/home/pgalindo3/cpython/Lib/test/test_gc.py", line 1063, in test_ensure_disabled_thread
        inside_status_after_thread = gc.isenabled()
    SystemError: <built-in method __exit__ of gc.ensure_disabled object at 0x7f0253d22de0> returned a result with an error set

Another possible test is checking that SystemError is not raised / does not appear in stderr.
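As a general illustration of the suggested technique (using a hypothetical stand-in function, since the gc.ensure_disabled from the reverted PR is not available), turning warnings into errors lets a test assert on the warning directly:

```python
import warnings

def poke_gc_state():
    # Hypothetical stand-in for code that emits the RuntimeWarning
    # discussed above; gc.ensure_disabled itself is not assumed here.
    warnings.warn("Garbage collector enabled while another thread "
                  "is inside gc.ensure_disabled", RuntimeWarning)

# filterwarnings('error') turns the warning into a raised exception,
# so a test can assert on it directly instead of scraping stderr.
with warnings.catch_warnings():
    warnings.filterwarnings("error")
    try:
        poke_gc_state()
        caught = None
    except RuntimeWarning as w:
        caught = str(w)

assert caught is not None and "gc.ensure_disabled" in caught
```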
Actually raising an exception in PyErr_WarnEx() is not an error, but standard behavior under the corresponding configuration. For now, running tests with -We in a debug build crashes:

    $ ./python -We -m test test_gc
    Run tests sequentially
    0:00:00 load avg: 0.33 [1/1] test_gc
    Fatal Python error: a function returned a result with an error set
    RuntimeWarning: Garbage collector enabled while another thread is inside gc.ensure_enabled

    The above exception was the direct cause of the following exception:

    SystemError: <built-in method __exit__ of gc.ensure_disabled object at 0x7fb1b62a08f8> returned a result with an error set

    Thread 0x00007fb1b1773700 (most recent call first):
    Current thread 0x00007fb1b824f040 (most recent call first):
The idea which this issue represents is not rejected. It is a good one; we found a need for it during the dev sprint last September.
Well, not everybody thinks it is a good one. I, for instance, don't think it's a good idea, so that is at least one "-1". I saw Serhiy was unsure about this new feature too, so maybe there are two "-1"s; I don't know. So I kindly ask(-ed) for it to be openly discussed on python-dev, instead of making a unilateral decision.

The problem with the current design is that GC can become enabled inside the 'with' block at any time:

    with gc.ensure_disabled():
        # gc.is_enabled() might be True !

The GC can become enabled from another OS thread, as we have one GC per process. Adding a locking mechanism might be tricky in terms of implementation, and risky in terms of allowing people to accidentally cause a deadlock or something. In short, I don't see any way to make this context manager work reliably in CPython.

Therefore, I think that the best location for a helper like this would be some unit-testing helpers collection, as it can work only under some very specific conditions. The core 'gc' module is currently a collection of low-level primitives. I think we need a very solid motivation to add a core GC function that works unreliably and is suitable only for unit testing (at best). Maybe I'm completely wrong here, in which case I would love to be proved wrong and not just ignored.
I concur with Yury on every point. The idea looks good to three core developers, but I just don't understand this. This feature looks useless and misleading to me. It is not atomic and doesn't ensure anything, despite its name. If it is added to the stdlib, there should be a very good explanation of its purpose and limitations in the documentation. And I agree that unittest may be a better place than gc if its purpose is testing.
Indeed if unit testing is the main use case, I don't really see the point of adding C code. People can code a simple helper using |
Nick and Raymond: do you remember what our original motivating use cases were for this idea during the sprint? |
If I recall the discussion correctly, it was:
Either of those two options requires changes to the main GC machinery though, as otherwise you basically *can't* write a correct context manager for this use case, since a direct call to gc.enable() in another thread would always be problematic.
For truly disabling GC globally you need to use a counted semaphore or other synchronization primitives, and this can be implemented at the Python level. But what are the use cases for this context manager? Isn't the naive approach enough?
https://bugs.python.org/issue41545 is a duplicate of this. In that report, there's an example of something that can go wrong with the save-a-boolean-per-context-manager approach even when threads are not used, but concurrent generators are:

    def gen():
        with gc_disabled():
            yield 1
            yield 2
            yield 3

    a = gen()
    b = gen()
    next(a)  # disable gc
    next(b)
    list(a)  # this re-enables the gc
    next(b)  # gc is enabled but shouldn't be

A counter for "number of times disabled" may be a better approach.
I've taken a shot at making this in pure Python, and took into account a few more factors such as starvation and reentrancy. As Nick said, one of the only ways it can truly work is if it's completely global, in which case we don't need to use the internal lock and relying on the GIL would work. It's not very efficient atm, as the overhead of the Lock() and

https://github.com/bharel/disable_gc

I'll release it to PyPI sometime soon and we can take a look at the not-so-accurate usage metric to see whether such a thing is needed in the standard library.
The discussion shows there is no consensus on adding this feature. Marking as pending with a view to closing this (unless someone objects).
I don't think "non-consensus" is a fair reading. We made an agreed-to commit, but reverted it only because there was a minor issue with it. The issue is stale because no one has worked on fixing it, but it is still valid. The proposal would work fine unless called in multiple threads, so it would need to be documented as not being thread-safe. It is no more or less powerful than the current way we do it with a
There were -1s from @1st1 and @serhiy-storchaka, which I interpreted as non-consensus. @ncoghlan wrote that there's no point in doing this unless it's done in a thread-safe way.

I do think this would have value if implemented in a thread-safe way, so this issue could probably be kept open.
It may be useful to compare the proposal to FWIW, there are reasons that people use a |
I mostly agreed with the sentiments in that selective quote. If we were to expand the gc API we need to make these expectations (really "don't expect any of this to reliably ever do what you want") clear in our

The difference between

Adding a context managing variant of

That leaves me as -0 on bothering with this, even though I'd have found the idea convenient in some tests. Even when I would've reached for it, it was probably still the wrong tool.
This may be getting OT for this issue, but given how important GC threshold tuning (including, in some cases, disabling automatic GC and only calling |
I'm channeling my ideal world of perfectly spherical theoretical python applications and implementations there... YouTube did similar gc tweaking. |
I agree, in the ideal world of perfectly spherical Python, the GC would be so fast and so perfectly self-tuning that no user code would ever have reason to touch it! |
I'm not sure what you're alluding to here. Perhaps we have different models of when people might want to selectively disable the GC? My experience is that, apart from uncommon unit tests, the desire to disable the GC arises when particular performance issues are encountered in long-running applications with a large resident working set and very dynamic allocation spikes. Those applications have a high likelihood of being multi-threaded.