classification
Title: collections.OrderedDict and weakref.ref raises "refcount is too small" assertion
Type: crash Stage: patch review
Components: Extension Modules Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: pablogsal Nosy List: eric.snow, leezu, miss-islington, pablogsal, pitrou, rhettinger, tim.peters, vstinner
Priority: normal Keywords: patch

Created on 2020-02-27 21:52 by leezu, last changed 2020-03-03 09:10 by vstinner.

Pull Requests
URL Status Linked Edit
PR 18749 merged pablogsal, 2020-03-02 19:22
PR 18754 merged pablogsal, 2020-03-02 23:31
PR 18755 merged miss-islington, 2020-03-02 23:36
PR 18756 merged pablogsal, 2020-03-02 23:36
PR 18762 merged miss-islington, 2020-03-03 02:50
PR 18763 merged miss-islington, 2020-03-03 02:51
Messages (14)
msg362846 - (view) Author: Leonard Lausen (leezu) Date: 2020-02-27 21:52
Below sample program, will raise "Modules/gcmodule.c:110: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small" on Python 3.8.2 debug build.
On 3.7.6 debug build, "Modules/gcmodule.c:277: visit_decref: Assertion `_PyGCHead_REFS(gc) != 0' failed." is raised.

```
import collections
import gc
import weakref

hooks_dict = collections.OrderedDict()
hooks_dict_ref = weakref.ref(hooks_dict)
gc.collect()

print('Hello world')
```


The complete error message on 3.8.2 debug build is

```
Modules/gcmodule.c:110: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small
Memory block allocated at (most recent call first):
  File "/home/$USER/test.py", line 6

object  : <weakref at 0x7ff788208a70; to 'collections.OrderedDict' at 0x7ff7881fab90>
type    : weakref
refcount: 1
address : 0x7ff788208a70
Fatal Python error: _PyObject_AssertFailed
Python runtime state: initialized

Current thread 0x00007ff789f9c080 (most recent call first):
  File "/home/$USER/test.py", line 7 in <module>
zsh: abort      PYTHONTRACEMALLOC=1 python ~/test.py
```
msg362862 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-02-28 03:38
Thanks for the succinct example!  Although you don't need the print() ;-)

I can confirm this crashes the same way under a current master build (albeit on Windows 64-bit).

gc is calculating how many references in the current generation are accounted for by intra-generation references, and gc's visit_decref() is getting called back by odictobject.c's odict_traverse(), on line

    Py_VISIT(od->od_weakreflist);

gc has found a second pointer to od->od_weakreflist, which is more than its refcount (1) claims exist.

Python's weakref implementation gives me headaches, so I'm adding Raymond to the nosy list under the hope the problem will be obvious to him.
msg362866 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-02-28 05:53
I'm suspecting that maybe we shouldn't be doing

    Py_VISIT(od->od_weakreflist);

at all - best I can tell from a quick (non-exhaustive!) scan, the objects in the weakref list aren't incref'ed to begin with.  And even if they were, that line would only be looking at the head of the list, ignoring all the non-head weakrefs after the head.
msg363177 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2020-03-02 14:43
Yes, I don't think other weakref-supporting objects traverse the weakreflist in their tp_traverse.
msg363189 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-03-02 15:54
I concur with Antoine and Tim. The GC already has the machinery to deal with weak references in the correct way (even more after recent bugfixes regarding callbacks). Traversing the weak reference list in incorrect because the object does not "own" the weak references to it, as the weak references can die even if the object is alive. Also, as Tim mentions, the traverse will be called on the head of the list, in the same way if you do object.__weakref__ you will only get the HEAD of the list:

>>> import weakref
>>> class A: ...
>>> a = A()
>>> w1 = weakref.ref(a)
>>> w2 = weakref.ref(a, lambda *args: None) # Use a callback to avoid re-using the original weakref
>>> id(w1)
4328697104
>>> id(w2)
4328758864
>>> id(a.__weakref__)
4328697104

I think that this is not very well documented, as there is no pointers on what should and should not be traversed in https://docs.python.org/3.8/c-api/typeobj.html#c.PyTypeObject.tp_traverse.

I will prepare a PR to the documentation if everybody agrees and another one removing the traverse unless someone sees that something else is at play.
msg363197 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-03-02 19:04
After some thought, I'm sure the diagnosis is correct:  the weakref list must be made invisible to gc.  That is, simply don't traverse it at all.  The crash is exactly what's expected from traversing objects a container doesn't own references to.

I agree the tp_traverse docs should point out that weakref lists are special this way, but I think the problem is unique to them - can't think of another case where a container points to an object it doesn't own a reference to.
msg363198 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-03-02 19:10
> I agree the tp_traverse docs should point out that weakref lists are special this way, but I think the problem is unique to them - can't think of another case where a container points to an object it doesn't own a reference to.

Agreed, I was thinking a small warning or something because this is not the first time I see similar errors or users confused about what they should and should not track (with this I mention that we should say that "only objects that are *owned* should be tracked, and then the weakref warning or something similar). I will prepare a PR soon.
msg363214 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-03-02 23:12
New changeset 0c2b509f9d1d3a9065bc62c2407e1dc2ed70e9c2 by Pablo Galindo in branch 'master':
bpo-39778: Don't traverse weak-reference lists OrderedDict's tp_traverse and tp_clear (GH-18749)
https://github.com/python/cpython/commit/0c2b509f9d1d3a9065bc62c2407e1dc2ed70e9c2
msg363219 - (view) Author: miss-islington (miss-islington) Date: 2020-03-02 23:53
New changeset 69ded3944c202da972754644c0bbf7f77cc5e8ea by Miss Islington (bot) in branch '3.7':
bpo-39778: Don't traverse weak-reference lists OrderedDict's tp_traverse and tp_clear (GH-18749)
https://github.com/python/cpython/commit/69ded3944c202da972754644c0bbf7f77cc5e8ea
msg363220 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-03-02 23:55
New changeset 9ddcb914f9c2debe7c1359b2450cd1573e86b91c by Pablo Galindo in branch '3.8':
[3.8] bpo-39778: Don't traverse weak-reference lists OrderedDict's tp_traverse and tp_clear (GH-18749) (GH-18756)
https://github.com/python/cpython/commit/9ddcb914f9c2debe7c1359b2450cd1573e86b91c
msg363236 - (view) Author: miss-islington (miss-islington) Date: 2020-03-03 02:50
New changeset 6df421fe87a9418d6c59f89dbc5d5573b6826855 by Pablo Galindo in branch 'master':
bpo-39778: Add clarification about tp_traverse and ownership (GH-18754)
https://github.com/python/cpython/commit/6df421fe87a9418d6c59f89dbc5d5573b6826855
msg363237 - (view) Author: miss-islington (miss-islington) Date: 2020-03-03 03:04
New changeset 72fff60d7649df88026838d8b5f14f541393f268 by Miss Islington (bot) in branch '3.7':
bpo-39778: Add clarification about tp_traverse and ownership (GH-18754)
https://github.com/python/cpython/commit/72fff60d7649df88026838d8b5f14f541393f268
msg363238 - (view) Author: miss-islington (miss-islington) Date: 2020-03-03 03:04
New changeset 1827fc30f463786ebff13752e35c3224652bc94e by Miss Islington (bot) in branch '3.8':
bpo-39778: Add clarification about tp_traverse and ownership (GH-18754)
https://github.com/python/cpython/commit/1827fc30f463786ebff13752e35c3224652bc94e
msg363246 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-03 09:10
It seems like OrderedDict was the only type which visited weak references:

$ scm.py grep 'VISIT.*weak'
master: Grep 'VISIT.*weak' -- <4284 filenames>
==============================================

Objects/odictobject.c:1457:    Py_VISIT(od->od_weakreflist);
History
Date User Action Args
2020-03-03 09:10:44vstinnersetmessages: + msg363246
2020-03-03 03:04:16miss-islingtonsetmessages: + msg363238
2020-03-03 03:04:04miss-islingtonsetmessages: + msg363237
2020-03-03 02:51:04miss-islingtonsetpull_requests: + pull_request18118
2020-03-03 02:50:56miss-islingtonsetpull_requests: + pull_request18117
2020-03-03 02:50:44miss-islingtonsetmessages: + msg363236
2020-03-02 23:55:23pablogsalsetmessages: + msg363220
2020-03-02 23:53:06miss-islingtonsetmessages: + msg363219
2020-03-02 23:36:48pablogsalsetpull_requests: + pull_request18111
2020-03-02 23:36:18miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request18110
2020-03-02 23:31:01pablogsalsetpull_requests: + pull_request18109
2020-03-02 23:12:57pablogsalsetmessages: + msg363214
2020-03-02 19:22:04pablogsalsetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request18104
2020-03-02 19:10:36pablogsalsetmessages: + msg363198
2020-03-02 19:04:46tim.peterssetmessages: + msg363197
2020-03-02 15:54:33pablogsalsetassignee: pablogsal
2020-03-02 15:54:29pablogsalsetnosy: + pablogsal
messages: + msg363189
2020-03-02 15:05:46vstinnersetnosy: + vstinner
2020-03-02 14:43:07pitrousetnosy: + pitrou
messages: + msg363177
2020-02-28 05:53:29tim.peterssetmessages: + msg362866
2020-02-28 05:45:44rhettingersetnosy: + eric.snow
2020-02-28 03:38:48tim.peterssetversions: + Python 3.7, Python 3.9
nosy: + rhettinger, tim.peters

messages: + msg362862

components: + Extension Modules, - C API
stage: needs patch
2020-02-27 22:06:11leezusettype: crash
2020-02-27 21:52:09leezucreate