Issue34042
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2018-07-04 11:42 by kayhayen, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 8119 | merged | yselivanov, 2018-07-05 19:52 | |
PR 8143 | merged | miss-islington, 2018-07-06 16:21 |
Messages (16) | |||
---|---|---|---|
msg321037 - (view) | Author: Kay Hayen (kayhayen) | Date: 2018-07-04 11:42 | |
I have a test in Nuitka, designed to detect reference counting problems with Python code. It basically takes a snapshot of the refcount, runs a function until it stabilizes, then says PASS, or else reports the last diff. Obviously for CPython it's supposed to pass. Testing with self compiled 3.7.0 and 3.7.0-1 as in Debian testing (buster) currently, this happens for me: This is a cutout, there are more than 100 functions, I am listing the ones that report: simpleFunction16: FAILED 118414 118412 leaked -2 simpleFunction17: FAILED 118395 118393 leaked -2 simpleFunction18: FAILED 118375 118373 leaked -2 ... simpleFunction21: FAILED 118337 118333 leaked -4 ... simpleFunction25: FAILED 118315 118313 leaked -2 simpleFunction26: FAILED 118295 118293 leaked -2 ... simpleFunction38: FAILED 118257 118253 leaked -4 simpleFunction39: FAILED 118235 118233 leaked -2 ... simpleFunction43: FAILED 118215 118213 leaked -2 simpleFunction48: FAILED 118195 118193 leaked -2 ... simpleFunction76: FAILED 118422 118418 leaked -4 ... simpleFunction88: FAILED 118400 118398 leaked -2 This is really bad, because normally values are positive, merely preventing a release. A negative value indicates that references have been lost. This will normally result in corruption of the memory allocator for Python, although I have not yet seen that. I have a few remaining cases, where compiled code causes negative leaks too, there is happens. But I didn't force the issue yet. Notice that this is of course with the debug Python version and let me express, pure CPython is used to run the test code. When compiling with Nuitka and 3.7, my private code gives the same ref counts for there, but it also pretty much does the same things, these are some of the functions: def simpleFunction16(): class EmptyClass: pass return EmptyClass def simpleFunction39(): class Parent(object): pass All the other cases also use locally defined classes and only test cases using local classes are failing for me here. To reproduce is easy: git clone --branch develop http://git.nuitka.net/Nuitka.git python3.7-dbg Nuitka/tests/basics/Referencing.py Thanks, Kay Hayen |
|||
msg321043 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * ![]() |
Date: 2018-07-04 13:29 | |
The simplest reproducer: import gc, sys for i in range(100): class A: pass del A c = gc.collect() del c print(sys.gettotalrefcount()) |
|||
msg321090 - (view) | Author: Kay Hayen (kayhayen) | Date: 2018-07-05 08:55 | |
Just to confirm, this is also happening on Windows as well, with both 32 and 64 bits. |
|||
msg321104 - (view) | Author: Pablo Galindo Salgado (pablogsal) * ![]() |
Date: 2018-07-05 14:50 | |
I ran git bisect with @Serhiy reproducer and it yielded commit 3929499914d47365ae744df312e16da8955c90ac as the first bad commit that has this problem. I did not look more into that but the bisect is reproducible (yields the same commit if you run it several times). |
|||
msg321120 - (view) | Author: Pablo Galindo Salgado (pablogsal) * ![]() |
Date: 2018-07-05 17:15 | |
I investigated that commit but I cannot find anything :(. |
|||
msg321122 - (view) | Author: Pablo Galindo Salgado (pablogsal) * ![]() |
Date: 2018-07-05 18:03 | |
I think I found it. After doing the bisect manually (as redirecting stdout/stderr) seems to affect somehow the test) I found commit b0a7a037b8fde56b62f886d5188bced7776777b4 as the culprit. Reverting this commit on the current master seem to solve the problem. This is the diff of the revert: diff --git a/Objects/dictobject.c b/Objects/dictobject.c index 40d7d8af6e..f7d792232e 100644 --- a/Objects/dictobject.c +++ b/Objects/dictobject.c @@ -613,52 +613,6 @@ new_dict_with_shared_keys(PyDictKeysObject *keys) return new_dict(keys, values); } - -static PyObject * -clone_combined_dict(PyDictObject *orig) -{ - assert(PyDict_CheckExact(orig)); - assert(orig->ma_values == NULL); - assert(orig->ma_keys->dk_refcnt == 1); - - Py_ssize_t keys_size = _PyDict_KeysSize(orig->ma_keys); - PyDictKeysObject *keys = PyObject_Malloc(keys_size); - if (keys == NULL) { - PyErr_NoMemory(); - return NULL; - } - - memcpy(keys, orig->ma_keys, keys_size); - - /* After copying key/value pairs, we need to incref all - keys and values and they are about to be co-owned by a - new dict object. */ - PyDictKeyEntry *ep0 = DK_ENTRIES(keys); - Py_ssize_t n = keys->dk_nentries; - for (Py_ssize_t i = 0; i < n; i++) { - PyDictKeyEntry *entry = &ep0[i]; - PyObject *value = entry->me_value; - if (value != NULL) { - Py_INCREF(value); - Py_INCREF(entry->me_key); - } - } - - PyDictObject *new = (PyDictObject *)new_dict(keys, NULL); - if (new == NULL) { - /* In case of an error, `new_dict()` takes care of - cleaning up `keys`. */ - return NULL; - } - new->ma_used = orig->ma_used; - assert(_PyDict_CheckConsistency(new)); - if (_PyObject_GC_IS_TRACKED(orig)) { - /* Maintain tracking. */ - _PyObject_GC_TRACK(new); - } - return (PyObject *)new; -} - PyObject * PyDict_New(void) { @@ -2527,13 +2481,7 @@ PyDict_Copy(PyObject *o) PyErr_BadInternalCall(); return NULL; } - mp = (PyDictObject *)o; - if (mp->ma_used == 0) { - /* The dict is empty; just return a new dict. */ - return PyDict_New(); - } - if (_PyDict_HasSplitTable(mp)) { PyDictObject *split_copy; Py_ssize_t size = USABLE_FRACTION(DK_SIZE(mp->ma_keys)); @@ -2560,27 +2508,6 @@ PyDict_Copy(PyObject *o) _PyObject_GC_TRACK(split_copy); return (PyObject *)split_copy; } - - if (PyDict_CheckExact(mp) && mp->ma_values == NULL && - (mp->ma_used >= (mp->ma_keys->dk_nentries * 2) / 3)) - { - /* Use fast-copy if: - - (1) 'mp' is an instance of a subclassed dict; and - - (2) 'mp' is not a split-dict; and - - (3) if 'mp' is non-compact ('del' operation does not resize dicts), - do fast-copy only if it has at most 1/3 non-used keys. - - The last condition (3) is important to guard against a pathological - case when a large dict is almost emptied with multiple del/pop - operations and copied after that. In cases like this, we defer to - PyDict_Merge, which produces a compacted copy. - */ - return clone_combined_dict(mp); - } - copy = PyDict_New(); if (copy == NULL) return NULL; |
|||
msg321123 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2018-07-05 18:04 | |
I've bisected myself and the culprit seems to be b0a7a037b8fde56b62f886d5188bced7776777b4 ("""bpo-31179: Make dict.copy() up to 5.5 times faster."""). |
|||
msg321124 - (view) | Author: Pablo Galindo Salgado (pablogsal) * ![]() |
Date: 2018-07-05 18:07 | |
@Antoine Wow, it seems that we both foundb0a7a037b8fde56b62f886d5188bced7776777b4 with one minute of difference :D I added Yuri to the noise list as is the author of that PR. |
|||
msg321128 - (view) | Author: Yury Selivanov (yselivanov) * ![]() |
Date: 2018-07-05 19:43 | |
This isn't a real reference bug, but rather a bug in total refs accountability. It seems that I missed the fact that we track refs to the keys table with a DK_INCREF macro. The new `clone_combined_dict` uses `memcpy` to clone the keys table (along with its `dk_refcnt` field, but it doesn't register the fact that we have a new keys table after copying. The bug can be solved with the following diff: diff --git a/Objects/dictobject.c b/Objects/dictobject.c index 7a1bcebec6..3ac6a54415 100644 --- a/Objects/dictobject.c +++ b/Objects/dictobject.c @@ -656,6 +656,7 @@ clone_combined_dict(PyDictObject *orig) /* Maintain tracking. */ _PyObject_GC_TRACK(new); } + _Py_INC_REFTOTAL; return (PyObject *)new; } I don't think this is a critical-level bug that needs an emergency 3.7.1 release, but I'll submit a PR right now. Would appreciate if you guys can review it. |
|||
msg321129 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2018-07-05 19:44 | |
Agreed, if it was a real reference bug, the interpreter should crash before the total reference count gets negative ;-) |
|||
msg321130 - (view) | Author: Yury Selivanov (yselivanov) * ![]() |
Date: 2018-07-05 19:54 | |
> Agreed, if it was a real reference bug, the interpreter should crash before the total reference count gets negative ;-) First thing I checked with Serhiy's script :) A PR with a fix: https://github.com/python/cpython/pull/8119 |
|||
msg321159 - (view) | Author: Kay Hayen (kayhayen) | Date: 2018-07-06 09:03 | |
Hello, so it's harmless and it explains the other reference counting issue, where a change in call convention could make a reference counting bug show or go away: codecs.open(TESTFN, encoding='cp949') This was showing it, where as codecs.open(TESTFN, "rb", 'cp949') was not. With all the difference being a dictionary passed into a function call or not. I am now assuming that getting your fix this will also go away. Caused me a bit of head scratching already. If you could do what you often you, and make this what distributions like Debian pull from, it would be good enough in terms of release for me, as it blocks Nuitka tests from passing on them. Yours, Kay |
|||
msg321178 - (view) | Author: Yury Selivanov (yselivanov) * ![]() |
Date: 2018-07-06 16:03 | |
> If you could do what you often you, and make this what distributions like Debian pull from, it would be good enough in terms of release for me, as it blocks Nuitka tests from passing on them. I'm not sure what you mean by "what you often do". I'll push a fix to master and 3.7 branches in a few minutes, so perhaps you can open a PR to ask Linux vendors to add a patch for 3.7.0 to their packages. An easier solution would be for you to disable refcounting tests that rely on gettotalrefcount just for 3.7.0. And I apologize for the hassle :( |
|||
msg321179 - (view) | Author: Yury Selivanov (yselivanov) * ![]() |
Date: 2018-07-06 16:20 | |
New changeset 0b75228700e0077d8bf2636e74733389514b4b2f by Yury Selivanov in branch 'master': bpo-34042: Fix dict.copy() to maintain correct total refcount (GH-8119) https://github.com/python/cpython/commit/0b75228700e0077d8bf2636e74733389514b4b2f |
|||
msg321182 - (view) | Author: miss-islington (miss-islington) | Date: 2018-07-06 16:40 | |
New changeset 127bd9bfd591c8ec1a97eb7f4037c8b884eef973 by Miss Islington (bot) in branch '3.7': bpo-34042: Fix dict.copy() to maintain correct total refcount (GH-8119) https://github.com/python/cpython/commit/127bd9bfd591c8ec1a97eb7f4037c8b884eef973 |
|||
msg321183 - (view) | Author: Yury Selivanov (yselivanov) * ![]() |
Date: 2018-07-06 16:41 | |
Thank you for reporting the issue, Kay. And huge thanks to Antoine and Pablo for bisecting. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:02 | admin | set | github: 78223 |
2018-07-06 16:41:53 | yselivanov | set | status: open -> closed resolution: fixed messages: + msg321183 stage: patch review -> resolved |
2018-07-06 16:40:19 | miss-islington | set | nosy:
+ miss-islington messages: + msg321182 |
2018-07-06 16:21:26 | miss-islington | set | pull_requests: + pull_request7715 |
2018-07-06 16:20:11 | yselivanov | set | messages: + msg321179 |
2018-07-06 16:03:58 | yselivanov | set | messages: + msg321178 |
2018-07-06 09:03:44 | kayhayen | set | messages: + msg321159 |
2018-07-05 19:54:19 | yselivanov | set | messages: + msg321130 |
2018-07-05 19:52:29 | yselivanov | set | keywords:
+ patch stage: patch review pull_requests: + pull_request7704 |
2018-07-05 19:44:53 | pitrou | set | messages: + msg321129 |
2018-07-05 19:43:52 | yselivanov | set | messages: + msg321128 |
2018-07-05 18:07:02 | pablogsal | set | messages: + msg321124 |
2018-07-05 18:04:26 | pitrou | set | nosy:
+ yselivanov messages: + msg321123 |
2018-07-05 18:03:58 | pablogsal | set | messages: + msg321122 |
2018-07-05 17:15:55 | pablogsal | set | messages: + msg321120 |
2018-07-05 14:50:12 | pablogsal | set | nosy:
+ pablogsal messages: + msg321104 |
2018-07-05 08:55:09 | kayhayen | set | messages: + msg321090 |
2018-07-04 13:29:17 | serhiy.storchaka | set | nosy:
+ pitrou, vstinner messages: + msg321043 |
2018-07-04 13:07:28 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka type: resource usage components: + Interpreter Core |
2018-07-04 11:42:51 | kayhayen | create |