-
-
Notifications
You must be signed in to change notification settings - Fork 29.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test test_multibytecodec: Test_IncrementalEncoder.test_subinterp() leaks references #87032
Comments
$ ./python -m test test_multibytecodec -m test.test_multibytecodec.Test_IncrementalEncoder.test_subinterp -R 3:3
(...)
test_multibytecodec leaked [258, 258, 258] references, sum=774 I simplified the code. The following test leaks references: def test_subinterp(self):
import _testcapi
code = textwrap.dedent("""
import _codecs_jp
codec = _codecs_jp.getcodec('cp932')
codec = None
""")
_testcapi.run_in_subinterp(code) _codecs_jp.getcodec() is defined in Modules/cjkcodecs/cjkcodecs.h. Extract: cofunc = getmultibytecodec();
...
codecobj = PyCapsule_New((void *)codec, PyMultibyteCodec_CAPSULE_NAME, NULL);
if (codecobj == NULL)
return NULL; r = PyObject_CallOneArg(cofunc, codecobj);
Py_DECREF(codecobj); getmultibytecodec() is the _multibytecodec.__create_codec() which is defined in Modules/cjkcodecs/multibytecodec.c. Simplified code: codec = PyCapsule_GetPointer(arg, PyMultibyteCodec_CAPSULE_NAME);
_multibytecodec_state *state = _multibytecodec_get_state(module);
self = PyObject_New(MultibyteCodecObject, state->multibytecodec_type);
self->codec = codec;
return (PyObject *)self; |
I added the test which leaks in bpo-42846. commit 07f2cee
I don't think that the leak is new. It's just that it wasn't seen previously. |
By the way, I wrote an article "Leaks discovered by subinterpreters": This leak may be new kind related to capsule, I'm not sure so far. |
Thank you so much for taking the time to write these blog posts, Victor, and for explaining your fixes is such great detail. It is very helpful! |
PR 24165 fix one reference leak in the getcodec() function of CJK codecs. But it doesn't fix all ref leaks of this issue. |
The leak can be simplified to: @support.cpython_only
def test_subinterp(self):
import _testcapi
code = textwrap.dedent("""
import encodings
import _codecs_jp
encodings._cache['cp932'] = _codecs_jp.getcodec('cp932')
""")
res = _testcapi.run_in_subinterp(code)
self.assertEqual(res, 0) |
Links about ref leaks related to encodings: |
For example, unregistering the search function prevents the leak: import encodings
import _codecs_jp
encodings._cache['cp932'] = _codecs_jp.getcodec('cp932')
import codecs
codecs.unregister(encodings.search_function) The PyInterpreterState.codec_search_path list is cleared at Python exit by interpreter_clear(). The weird part is that the _codecs_jp.getcodec('cp932') codec object *is* deleted. I checked and multibytecodec_dealloc() is called with the object stored in the encodings cache. A _multibytecodec.MultibyteCodec instance (MultibyteCodecObject* structure in C) is a simple type: it only stores pointer to C functions and C strings. It doesn't contain any Python object. So I don't see how it could be part of a reference cycle by itself. Moreover, again, it is deleted. |
Calling gc.collect() twice works around the issue, which sounds like a missing traverse function somewhere. diff --git a/Python/pystate.c b/Python/pystate.c
index c791b23999..66bbe1bf7d 100644
--- a/Python/pystate.c
+++ b/Python/pystate.c
@@ -321,6 +321,7 @@ interpreter_clear(PyInterpreterState *interp, PyThreadState *tstate)
/* Last garbage collection on this interpreter */
_PyGC_CollectNoFail(tstate);
+ _PyGC_CollectNoFail(tstate);
_PyGC_Fini(tstate);
/* We don't clear sysdict and builtins until the end of this function. |
Don't you need to free memory using PyObject_GC_Del when you allocate using PyObject_GC_New? |
My PR uses the generic tp->tp_free which PyObject_GC_Del(). |
typo: My PR uses the generic tp->tp_free which *is* PyObject_GC_Del(). |
Ah, thanks! I also found that info in the docs: https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_free |
Ok, it's now fixed. To make sure that a heap type can be deleted, it must by a GC type, has a traverse function, and it must track its instances. We should take this in account when we convert static types to heap types. In practice, deleting a type is mostly an issue when we check for reference leak and an instance is kept alive until the last GC collection, at the end of Py_EndInterpreter(). |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: