msg375355 - (view) |
Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * |
Date: 2020-08-13 23:52 |
I have a construct in my code where I wrap a function's logic with `gc.disable()` to prevent GC from triggering some race condition.
Today this construct got a bit more complicated, and another function had to use this "trick". Now there are 2 flows:
foo():
gc.disable()
....
gc.enable()
bar()
....
gc.disable()
...
# bar calls foo, which also messes with the gc
foo()
gc.disable()
...
gc.enable()
...
gc.enable()
...
I'd expected the GC to be truly enabled only on the second call to `enable()`, but apparently it's enabled on the first call :( Both `gc.enable()` and `gc.disable()` just write `1`/`0` to `gcstate->enabled`, respectively.
For the meantime I wrote a simple wrapper class to do this counting (finally calling `enable()` only when necessary).
Another option is for the code to check first "is GC enabled?" before disabling, and before enabling again, remember whether it really disabled it or not.
But I think those manual workarounds are more error prone, and it's easier to forget to use it them in all sites (compared to using the "right" API from the standard module), so I was considering if we can add an API that encapsulate this counting: an enable-disable pair that makes sure GC is only enabled back when the number of "enable" calls matches the number of "disable" calls.
|
msg375360 - (view) |
Author: Steven D'Aprano (steven.daprano) *  |
Date: 2020-08-14 00:55 |
> I wrap a function's logic with `gc.disable()` to prevent GC from triggering some race condition.
If this race condition is a bug in gc, then we should fix that.
If it is a bug in your code, surely you should fix that rather than disable gc.
Either way, I don't think we should complicate the gc iterface by adding a second way to enable and disable the cyclic gc.
|
msg375376 - (view) |
Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * |
Date: 2020-08-14 08:04 |
> If this race condition is a bug in gc, then we should fix that.
> If it is a bug in your code, surely you should fix that rather than disable gc.
It's neither: it is something related to the combination of some 3rd party libraries I'm using (if you're interested, I have described it here: https://github.com/paramiko/paramiko/issues/1634).
> Either way, I don't think we should complicate the gc iterface by adding a second way to enable and disable the cyclic gc.
I see what you're saying. I wasn't thinking of of this idea as complicating it, I had in mind existing interfaces which have these 2 sets of functions (for example, Linux's local_irq_enable/disable and local_irq_save/restore).
Another approach, only modifying the existing API in a compatible way, will be as follows:
--- a/Modules/gcmodule.c
+++ b/Modules/gcmodule.c
@@ -1489,9 +1489,10 @@ static PyObject *
gc_disable_impl(PyObject *module)
/*[clinic end generated code: output=97d1030f7aa9d279 input=8c2e5a14e800d83b]*/
{
+ int was_enabled = gcstate->enabled
GCState *gcstate = get_gc_state();
gcstate->enabled = 0;
- Py_RETURN_NONE;
+ return was_enabled ? (Py_INCREF(Py_True), Py_True) : (Py_INCREF(Py_False), Py_False);
}
Then I can write code this way:
foo():
disabled = gc.disable()
....
if disabled:
gc.enable()
It can be taken ahead to change `gc.enable()` to `gc.enable(disabled=True)` so I can just call it as `gc.enable(disabled)`:
foo():
disabled = gc.disable()
....
gc.enable(disabled)
|
msg375423 - (view) |
Author: Ammar Askar (ammar2) *  |
Date: 2020-08-14 18:29 |
> I'd expected the GC to be truly enabled only on the second call to `enable()`, but apparently it's enabled on the first call :( Both `gc.enable()` and `gc.disable()` just write `1`/`0` to `gcstate->enabled`, respectively.
I don't think this API makes much sense. I would expect `enable()` to enable the GC regardless of how many disable calls have been made. This is also in line with other functions like `faulthandler.enable()` / `bdb.Breakpoint.enable()`
I think the proper solution if you want to do this would be to wrap the disabling/enabling and keep track of the number of calls yourself.
|
msg375450 - (view) |
Author: Dennis Sweeney (Dennis Sweeney) *  |
Date: 2020-08-15 07:05 |
This is exactly the motivation for context managers, no? I attached no_gc.py, which works when nested and should additionally be thread-safe. Usage:
from no_gc import no_gc
with no_gc():
# collection disabled
with no_gc():
# collection is still disabled
# collection is still disabled
# collection is finally re-enabled here
|
msg375497 - (view) |
Author: Irit Katriel (iritkatriel) *  |
Date: 2020-08-15 22:43 |
There is also gc.isenabled(), so couldn't you check that before disabling and remember whether you needed to disable or not?
|
msg375543 - (view) |
Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * |
Date: 2020-08-17 12:20 |
> This is exactly the motivation for context managers, no? I attached no_gc.py, which works when nested and should additionally be thread-safe.
My solution was roughly the same (also a context manager, but a bit simplified because I didn't need threading support so I didn't bother with locking).
> There is also gc.isenabled(), so couldn't you check that before disabling and remember whether you needed to disable or not?
Yes, but it must be protected like Dennis suggested, otherwise it can't be used in a race-free way, for example this snippet is susceptible to a thread switch between the `isenabled()` and `disable()` calls (so another thread could meanwhile disable GC, and we retain a stale `was_enabled` result)
was_enabled = gc.isenabled()
gc.disable()
...
if was_enabled:
gc.enable()
My points in this issue are:
1. I think that such a safer API should be available in the standard library (I don't want to find myself repeating this across different projects). I think that wherever you find yourself using `gc.disable()` you should actually be using a safer API (that takes into account multithreading & previous invocations of `gc.disable()`)
2. A tiny change in `gc.enable()` (as I suggested in msg375376) can make it substantially easier for the Python side to protect itself, because it will be race-free without any locks.
|
msg375557 - (view) |
Author: Dennis Sweeney (Dennis Sweeney) *  |
Date: 2020-08-17 16:21 |
The save-a-boolean-for-each-context-manager approach has an issue if used with concurrent generators, where the lifetimes of two generator objects might be overlapping but not completely nested, as shown below. The same issue should arise when using multiple threads. The approach in no_gc.py with counting the times_disabled does not run into the same issue.
>>> from contextlib import contextmanager
>>> import gc
>>> @contextmanager
def no_gc():
was_enabled = gc.isenabled()
try:
yield
finally:
if was_enabled:
gc.enable()
>>> def gen1():
with no_gc():
yield "a"
yield "b"
yield "c"
>>> def gen2():
with no_gc():
yield 1
yield 2
yield 3
>>>
>>> g1 = gen1()
>>> g2 = gen2()
>>> next(g1)
'a'
>>> next(g2)
1
>>> list(g1)
['b', 'c']
>>> gc.isenabled() # should be False!
True
>>> list(g2)
[2, 3]
|
msg375558 - (view) |
Author: Dennis Sweeney (Dennis Sweeney) *  |
Date: 2020-08-17 16:23 |
FWIW I forgot the gc.disable() line in the contextmanager, but what I said still applies.
|
msg375568 - (view) |
Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * |
Date: 2020-08-17 21:37 |
Hmm... I didn't think of overlapping lock periods. You are right, Dennis, a counter must be managed.
|
msg385212 - (view) |
Author: Dennis Sweeney (Dennis Sweeney) *  |
Date: 2021-01-18 18:17 |
It looks like this was a duplicate of https://bugs.python.org/issue31356
|
msg385411 - (view) |
Author: Yonatan Goldschmidt (Yonatan Goldschmidt) * |
Date: 2021-01-21 11:54 |
Dennis, you're right. I guess I missed it when I previously searched for matching issues. https://bugs.python.org/issue31356 indeed solves my problem. Closing this as a duplicate.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:34 | admin | set | github: 85717 |
2021-01-21 11:54:56 | Yonatan Goldschmidt | set | status: open -> closed resolution: duplicate messages:
+ msg385411
stage: resolved |
2021-01-18 18:17:24 | Dennis Sweeney | set | messages:
+ msg385212 |
2020-08-17 21:37:25 | Yonatan Goldschmidt | set | messages:
+ msg375568 |
2020-08-17 16:23:53 | Dennis Sweeney | set | messages:
+ msg375558 |
2020-08-17 16:21:21 | Dennis Sweeney | set | messages:
+ msg375557 |
2020-08-17 12:20:57 | Yonatan Goldschmidt | set | messages:
+ msg375543 |
2020-08-15 22:43:46 | iritkatriel | set | nosy:
+ iritkatriel messages:
+ msg375497
|
2020-08-15 07:05:20 | Dennis Sweeney | set | files:
+ no_gc.py nosy:
+ Dennis Sweeney messages:
+ msg375450
|
2020-08-14 18:29:37 | ammar2 | set | nosy:
+ ammar2 messages:
+ msg375423
|
2020-08-14 08:04:36 | Yonatan Goldschmidt | set | messages:
+ msg375376 |
2020-08-14 00:55:12 | steven.daprano | set | nosy:
+ steven.daprano messages:
+ msg375360
|
2020-08-13 23:52:21 | Yonatan Goldschmidt | create | |