msg362846 - (view) |
Author: Leonard Lausen (leezu) |
Date: 2020-02-27 21:52 |
Below sample program, will raise "Modules/gcmodule.c:110: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small" on Python 3.8.2 debug build.
On 3.7.6 debug build, "Modules/gcmodule.c:277: visit_decref: Assertion `_PyGCHead_REFS(gc) != 0' failed." is raised.
```
import collections
import gc
import weakref
hooks_dict = collections.OrderedDict()
hooks_dict_ref = weakref.ref(hooks_dict)
gc.collect()
print('Hello world')
```
The complete error message on 3.8.2 debug build is
```
Modules/gcmodule.c:110: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small
Memory block allocated at (most recent call first):
File "/home/$USER/test.py", line 6
object : <weakref at 0x7ff788208a70; to 'collections.OrderedDict' at 0x7ff7881fab90>
type : weakref
refcount: 1
address : 0x7ff788208a70
Fatal Python error: _PyObject_AssertFailed
Python runtime state: initialized
Current thread 0x00007ff789f9c080 (most recent call first):
File "/home/$USER/test.py", line 7 in <module>
zsh: abort PYTHONTRACEMALLOC=1 python ~/test.py
```
|
msg362862 - (view) |
Author: Tim Peters (tim.peters) * |
Date: 2020-02-28 03:38 |
Thanks for the succinct example! Although you don't need the print() ;-)
I can confirm this crashes the same way under a current master build (albeit on Windows 64-bit).
gc is calculating how many references in the current generation are accounted for by intra-generation references, and gc's visit_decref() is getting called back by odictobject.c's odict_traverse(), on line
Py_VISIT(od->od_weakreflist);
gc has found a second pointer to od->od_weakreflist, which is more than its refcount (1) claims exist.
Python's weakref implementation gives me headaches, so I'm adding Raymond to the nosy list under the hope the problem will be obvious to him.
|
msg362866 - (view) |
Author: Tim Peters (tim.peters) * |
Date: 2020-02-28 05:53 |
I'm suspecting that maybe we shouldn't be doing
Py_VISIT(od->od_weakreflist);
at all - best I can tell from a quick (non-exhaustive!) scan, the objects in the weakref list aren't incref'ed to begin with. And even if they were, that line would only be looking at the head of the list, ignoring all the non-head weakrefs after the head.
|
msg363177 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2020-03-02 14:43 |
Yes, I don't think other weakref-supporting objects traverse the weakreflist in their tp_traverse.
|
msg363189 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2020-03-02 15:54 |
I concur with Antoine and Tim. The GC already has the machinery to deal with weak references in the correct way (even more after recent bugfixes regarding callbacks). Traversing the weak reference list in incorrect because the object does not "own" the weak references to it, as the weak references can die even if the object is alive. Also, as Tim mentions, the traverse will be called on the head of the list, in the same way if you do object.__weakref__ you will only get the HEAD of the list:
>>> import weakref
>>> class A: ...
>>> a = A()
>>> w1 = weakref.ref(a)
>>> w2 = weakref.ref(a, lambda *args: None) # Use a callback to avoid re-using the original weakref
>>> id(w1)
4328697104
>>> id(w2)
4328758864
>>> id(a.__weakref__)
4328697104
I think that this is not very well documented, as there is no pointers on what should and should not be traversed in https://docs.python.org/3.8/c-api/typeobj.html#c.PyTypeObject.tp_traverse.
I will prepare a PR to the documentation if everybody agrees and another one removing the traverse unless someone sees that something else is at play.
|
msg363197 - (view) |
Author: Tim Peters (tim.peters) * |
Date: 2020-03-02 19:04 |
After some thought, I'm sure the diagnosis is correct: the weakref list must be made invisible to gc. That is, simply don't traverse it at all. The crash is exactly what's expected from traversing objects a container doesn't own references to.
I agree the tp_traverse docs should point out that weakref lists are special this way, but I think the problem is unique to them - can't think of another case where a container points to an object it doesn't own a reference to.
|
msg363198 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2020-03-02 19:10 |
> I agree the tp_traverse docs should point out that weakref lists are special this way, but I think the problem is unique to them - can't think of another case where a container points to an object it doesn't own a reference to.
Agreed, I was thinking a small warning or something because this is not the first time I see similar errors or users confused about what they should and should not track (with this I mention that we should say that "only objects that are *owned* should be tracked, and then the weakref warning or something similar). I will prepare a PR soon.
|
msg363214 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2020-03-02 23:12 |
New changeset 0c2b509f9d1d3a9065bc62c2407e1dc2ed70e9c2 by Pablo Galindo in branch 'master':
bpo-39778: Don't traverse weak-reference lists OrderedDict's tp_traverse and tp_clear (GH-18749)
https://github.com/python/cpython/commit/0c2b509f9d1d3a9065bc62c2407e1dc2ed70e9c2
|
msg363219 - (view) |
Author: miss-islington (miss-islington) |
Date: 2020-03-02 23:53 |
New changeset 69ded3944c202da972754644c0bbf7f77cc5e8ea by Miss Islington (bot) in branch '3.7':
bpo-39778: Don't traverse weak-reference lists OrderedDict's tp_traverse and tp_clear (GH-18749)
https://github.com/python/cpython/commit/69ded3944c202da972754644c0bbf7f77cc5e8ea
|
msg363220 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2020-03-02 23:55 |
New changeset 9ddcb914f9c2debe7c1359b2450cd1573e86b91c by Pablo Galindo in branch '3.8':
[3.8] bpo-39778: Don't traverse weak-reference lists OrderedDict's tp_traverse and tp_clear (GH-18749) (GH-18756)
https://github.com/python/cpython/commit/9ddcb914f9c2debe7c1359b2450cd1573e86b91c
|
msg363236 - (view) |
Author: miss-islington (miss-islington) |
Date: 2020-03-03 02:50 |
New changeset 6df421fe87a9418d6c59f89dbc5d5573b6826855 by Pablo Galindo in branch 'master':
bpo-39778: Add clarification about tp_traverse and ownership (GH-18754)
https://github.com/python/cpython/commit/6df421fe87a9418d6c59f89dbc5d5573b6826855
|
msg363237 - (view) |
Author: miss-islington (miss-islington) |
Date: 2020-03-03 03:04 |
New changeset 72fff60d7649df88026838d8b5f14f541393f268 by Miss Islington (bot) in branch '3.7':
bpo-39778: Add clarification about tp_traverse and ownership (GH-18754)
https://github.com/python/cpython/commit/72fff60d7649df88026838d8b5f14f541393f268
|
msg363238 - (view) |
Author: miss-islington (miss-islington) |
Date: 2020-03-03 03:04 |
New changeset 1827fc30f463786ebff13752e35c3224652bc94e by Miss Islington (bot) in branch '3.8':
bpo-39778: Add clarification about tp_traverse and ownership (GH-18754)
https://github.com/python/cpython/commit/1827fc30f463786ebff13752e35c3224652bc94e
|
msg363246 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-03 09:10 |
It seems like OrderedDict was the only type which visited weak references:
$ scm.py grep 'VISIT.*weak'
master: Grep 'VISIT.*weak' -- <4284 filenames>
==============================================
Objects/odictobject.c:1457: Py_VISIT(od->od_weakreflist);
|
msg402173 - (view) |
Author: Irit Katriel (iritkatriel) * |
Date: 2021-09-19 22:06 |
I could not reproduce the crash and from the discussion it seems resolved. Is there anything left here?
|
msg402181 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2021-09-19 23:05 |
> I could not reproduce the crash and from the discussion it seems resolved. Is there anything left here?
No, this should be fixed by now. I just forgot to close the issue. Thanks for the ping!
|
msg402191 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2021-09-20 05:51 |
I'm thinking that edit to tp_dealloc was incorrect. The PyClear call should have been replaced (not removed) with the pattern shown in the C APIs docs ( https://docs.python.org/3/extending/newtypes.html?highlight=pyobject_clearw#weak-reference-support ):
if (self->weakreflist != NULL)
PyObject_ClearWeakRefs((PyObject *) self);
|
msg402192 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2021-09-20 06:26 |
Also a test should be added to make sure the weakref callbacks are still occurring.
|
msg402221 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2021-09-20 10:44 |
> I'm thinking that edit to tp_dealloc was incorrect.
What edit are you referring to? PR 18749 only touches tp_clear and tp_traverse, not tp_dealloc
|
msg402222 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2021-09-20 10:45 |
Also, the ordered dict dealloc is already doing that:
https://github.com/python/cpython/blob/a856364cc920d8b16750fd1fadc902efb509754c/Objects/odictobject.c#L1372-L1373
|
msg402275 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2021-09-20 19:40 |
False alarm. I misread the diff. All is well.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:27 | admin | set | github: 83959 |
2021-09-20 19:40:09 | rhettinger | set | status: open -> closed resolution: fixed messages:
+ msg402275
stage: patch review -> resolved |
2021-09-20 10:45:38 | pablogsal | set | messages:
+ msg402222 |
2021-09-20 10:44:55 | pablogsal | set | messages:
+ msg402221 |
2021-09-20 07:14:49 | vstinner | set | nosy:
- vstinner
|
2021-09-20 06:26:29 | rhettinger | set | messages:
+ msg402192 |
2021-09-20 05:51:14 | rhettinger | set | messages:
+ msg402191 |
2021-09-19 23:05:30 | pablogsal | set | messages:
+ msg402181 |
2021-09-19 22:06:30 | iritkatriel | set | nosy:
+ iritkatriel messages:
+ msg402173
|
2020-03-03 09:10:44 | vstinner | set | messages:
+ msg363246 |
2020-03-03 03:04:16 | miss-islington | set | messages:
+ msg363238 |
2020-03-03 03:04:04 | miss-islington | set | messages:
+ msg363237 |
2020-03-03 02:51:04 | miss-islington | set | pull_requests:
+ pull_request18118 |
2020-03-03 02:50:56 | miss-islington | set | pull_requests:
+ pull_request18117 |
2020-03-03 02:50:44 | miss-islington | set | messages:
+ msg363236 |
2020-03-02 23:55:23 | pablogsal | set | messages:
+ msg363220 |
2020-03-02 23:53:06 | miss-islington | set | messages:
+ msg363219 |
2020-03-02 23:36:48 | pablogsal | set | pull_requests:
+ pull_request18111 |
2020-03-02 23:36:18 | miss-islington | set | nosy:
+ miss-islington pull_requests:
+ pull_request18110
|
2020-03-02 23:31:01 | pablogsal | set | pull_requests:
+ pull_request18109 |
2020-03-02 23:12:57 | pablogsal | set | messages:
+ msg363214 |
2020-03-02 19:22:04 | pablogsal | set | keywords:
+ patch stage: needs patch -> patch review pull_requests:
+ pull_request18104 |
2020-03-02 19:10:36 | pablogsal | set | messages:
+ msg363198 |
2020-03-02 19:04:46 | tim.peters | set | messages:
+ msg363197 |
2020-03-02 15:54:33 | pablogsal | set | assignee: pablogsal |
2020-03-02 15:54:29 | pablogsal | set | nosy:
+ pablogsal messages:
+ msg363189
|
2020-03-02 15:05:46 | vstinner | set | nosy:
+ vstinner
|
2020-03-02 14:43:07 | pitrou | set | nosy:
+ pitrou messages:
+ msg363177
|
2020-02-28 05:53:29 | tim.peters | set | messages:
+ msg362866 |
2020-02-28 05:45:44 | rhettinger | set | nosy:
+ eric.snow
|
2020-02-28 03:38:48 | tim.peters | set | versions:
+ Python 3.7, Python 3.9 nosy:
+ rhettinger, tim.peters
messages:
+ msg362862
components:
+ Extension Modules, - C API stage: needs patch |
2020-02-27 22:06:11 | leezu | set | type: crash |
2020-02-27 21:52:09 | leezu | create | |