This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: gc API requiring matching number of gc.disable - gc.enable calls
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution: duplicate
Dependencies: Superseder:
Assigned To: Nosy List: Dennis Sweeney, Yonatan Goldschmidt, ammar2, iritkatriel, steven.daprano
Priority: normal Keywords:

Created on 2020-08-13 23:52 by Yonatan Goldschmidt, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
no_gc.py Dennis Sweeney, 2020-08-15 07:05 Context manager for temporarily disabling the cyclic GC
Messages (12)
msg375355 - (view) Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * Date: 2020-08-13 23:52
I have a construct in my code where I wrap a function's logic with `gc.disable()` to prevent GC from triggering some race condition.

Today this construct got a bit more complicated, and another function had to use this "trick". Now there are 2 flows:

foo():
    gc.disable()
    ....
    gc.enable()

bar()
    ....
    gc.disable()
    ...
    # bar calls foo, which also messes with the gc
    foo()
        gc.disable()
        ...
        gc.enable()
    ...
    gc.enable()
    ...

I'd expected the GC to be truly enabled only on the second call to `enable()`, but apparently it's enabled on the first call :( Both `gc.enable()` and `gc.disable()` just write `1`/`0` to `gcstate->enabled`, respectively.

For the meantime I wrote a simple wrapper class to do this counting (finally calling `enable()` only when necessary).

Another option is for the code to check first "is GC enabled?" before disabling, and before enabling again, remember whether it really disabled it or not.

But I think those manual workarounds are more error prone, and it's easier to forget to use it them in all sites (compared to using the "right" API from the standard module), so I was considering if we can add an API that encapsulate this counting: an enable-disable pair that makes sure GC is only enabled back when the number of "enable" calls matches the number of "disable" calls.
msg375360 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-08-14 00:55
> I wrap a function's logic with `gc.disable()` to prevent GC from triggering some race condition.

If this race condition is a bug in gc, then we should fix that.

If it is a bug in your code, surely you should fix that rather than disable gc.

Either way, I don't think we should complicate the gc iterface by adding a second way to enable and disable the cyclic gc.
msg375376 - (view) Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * Date: 2020-08-14 08:04
> If this race condition is a bug in gc, then we should fix that.

> If it is a bug in your code, surely you should fix that rather than disable gc.

It's neither: it is something related to the combination of some 3rd party libraries I'm using (if you're interested, I have described it here: https://github.com/paramiko/paramiko/issues/1634).

> Either way, I don't think we should complicate the gc iterface by adding a second way to enable and disable the cyclic gc.

I see what you're saying. I wasn't thinking of of this idea as complicating it, I had in mind existing interfaces which have these 2 sets of functions (for example, Linux's local_irq_enable/disable and local_irq_save/restore).

Another approach, only modifying the existing API in a compatible way, will be as follows:

    --- a/Modules/gcmodule.c
    +++ b/Modules/gcmodule.c
    @@ -1489,9 +1489,10 @@ static PyObject *
     gc_disable_impl(PyObject *module)
     /*[clinic end generated code: output=97d1030f7aa9d279 input=8c2e5a14e800d83b]*/
     {
    +    int was_enabled = gcstate->enabled
         GCState *gcstate = get_gc_state();
         gcstate->enabled = 0;
    -    Py_RETURN_NONE;
    +    return was_enabled ? (Py_INCREF(Py_True), Py_True) : (Py_INCREF(Py_False), Py_False);
     }

Then I can write code this way:

    foo():
        disabled = gc.disable()
        ....
        if disabled:
             gc.enable()

It can be taken ahead to change `gc.enable()` to `gc.enable(disabled=True)` so I can just call it as `gc.enable(disabled)`:

    foo():
        disabled = gc.disable()
        ....
        gc.enable(disabled)
msg375423 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2020-08-14 18:29
> I'd expected the GC to be truly enabled only on the second call to `enable()`, but apparently it's enabled on the first call :( Both `gc.enable()` and `gc.disable()` just write `1`/`0` to `gcstate->enabled`, respectively.

I don't think this API makes much sense. I would expect `enable()` to enable the GC regardless of how many disable calls have been made. This is also in line with other functions like `faulthandler.enable()` / `bdb.Breakpoint.enable()`

I think the proper solution if you want to do this would be to wrap the disabling/enabling and keep track of the number of calls yourself.
msg375450 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2020-08-15 07:05
This is exactly the motivation for context managers, no? I attached no_gc.py, which works when nested and should additionally be thread-safe. Usage:

from no_gc import no_gc

with no_gc():
    # collection disabled
    with no_gc():
        # collection is still disabled
    # collection is still disabled
# collection is finally re-enabled here
msg375497 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-08-15 22:43
There is also gc.isenabled(), so couldn't you check that before disabling and remember whether you needed to disable or not?
msg375543 - (view) Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * Date: 2020-08-17 12:20
> This is exactly the motivation for context managers, no? I attached no_gc.py, which works when nested and should additionally be thread-safe.

My solution was roughly the same (also a context manager, but a bit simplified because I didn't need threading support so I didn't bother with locking).

> There is also gc.isenabled(), so couldn't you check that before disabling and remember whether you needed to disable or not?

Yes, but it must be protected like Dennis suggested, otherwise it can't be used in a race-free way, for example this snippet is susceptible to a thread switch between the `isenabled()` and `disable()` calls (so another thread could meanwhile disable GC, and we retain a stale `was_enabled` result)

     was_enabled = gc.isenabled()
     gc.disable()
     ...
     if was_enabled:
         gc.enable()

My points in this issue are:

1. I think that such a safer API should be available in the standard library (I don't want to find myself repeating this across different projects). I think that wherever you find yourself using `gc.disable()` you should actually be using a safer API (that takes into account multithreading & previous invocations of `gc.disable()`)
2. A tiny change in `gc.enable()` (as I suggested in msg375376) can make it substantially easier for the Python side to protect itself, because it will be race-free without any locks.
msg375557 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2020-08-17 16:21
The save-a-boolean-for-each-context-manager approach has an issue if used with concurrent generators, where the lifetimes of two generator objects might be overlapping but not completely nested, as shown below. The same issue should arise when using multiple threads. The approach in no_gc.py with counting the times_disabled does not run into the same issue.

>>> from contextlib import contextmanager
>>> import gc
>>> @contextmanager
def no_gc():
	was_enabled = gc.isenabled()
	try:
		yield
	finally:
		if was_enabled:
			gc.enable()

			
>>> def gen1():
	with no_gc():
		yield "a"
		yield "b"
		yield "c"

		
>>> def gen2():
	with no_gc():
		yield 1
		yield 2
		yield 3

		
>>> 
>>> g1 = gen1()
>>> g2 = gen2()
>>> next(g1)
'a'
>>> next(g2)
1
>>> list(g1)
['b', 'c']
>>> gc.isenabled() # should be False!
True
>>> list(g2)
[2, 3]
msg375558 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2020-08-17 16:23
FWIW I forgot the gc.disable() line in the contextmanager, but what I said still applies.
msg375568 - (view) Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * Date: 2020-08-17 21:37
Hmm... I didn't think of overlapping lock periods. You are right, Dennis, a counter must be managed.
msg385212 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2021-01-18 18:17
It looks like this was a duplicate of https://bugs.python.org/issue31356
msg385411 - (view) Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * Date: 2021-01-21 11:54
Dennis, you're right. I guess I missed it when I previously searched for matching issues. https://bugs.python.org/issue31356 indeed solves my problem. Closing this as a duplicate.
History
Date User Action Args
2022-04-11 14:59:34adminsetgithub: 85717
2021-01-21 11:54:56Yonatan Goldschmidtsetstatus: open -> closed
resolution: duplicate
messages: + msg385411

stage: resolved
2021-01-18 18:17:24Dennis Sweeneysetmessages: + msg385212
2020-08-17 21:37:25Yonatan Goldschmidtsetmessages: + msg375568
2020-08-17 16:23:53Dennis Sweeneysetmessages: + msg375558
2020-08-17 16:21:21Dennis Sweeneysetmessages: + msg375557
2020-08-17 12:20:57Yonatan Goldschmidtsetmessages: + msg375543
2020-08-15 22:43:46iritkatrielsetnosy: + iritkatriel
messages: + msg375497
2020-08-15 07:05:20Dennis Sweeneysetfiles: + no_gc.py
nosy: + Dennis Sweeney
messages: + msg375450

2020-08-14 18:29:37ammar2setnosy: + ammar2
messages: + msg375423
2020-08-14 08:04:36Yonatan Goldschmidtsetmessages: + msg375376
2020-08-14 00:55:12steven.dapranosetnosy: + steven.daprano
messages: + msg375360
2020-08-13 23:52:21Yonatan Goldschmidtcreate