classification
Title: test test_multibytecodec: Test_IncrementalEncoder.test_subinterp() leaks references
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: corona10, erlendaasland, miss-islington, vstinner
Priority: normal Keywords: patch

Created on 2021-01-08 09:28 by vstinner, last changed 2021-01-08 14:49 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 24165 merged vstinner, 2021-01-08 09:58
PR 24166 merged vstinner, 2021-01-08 14:02
PR 24167 closed miss-islington, 2021-01-08 14:24
PR 24168 closed miss-islington, 2021-01-08 14:26
Messages (18)
msg384643 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 09:28
$ ./python -m test test_multibytecodec -m test.test_multibytecodec.Test_IncrementalEncoder.test_subinterp -R 3:3
(...)
test_multibytecodec leaked [258, 258, 258] references, sum=774

I simplified the code. The following test leaks references:

    def test_subinterp(self):
        import _testcapi
        code = textwrap.dedent("""
            import _codecs_jp
            codec = _codecs_jp.getcodec('cp932')
            codec = None
        """)
        _testcapi.run_in_subinterp(code)

_codecs_jp.getcodec() is defined in Modules/cjkcodecs/cjkcodecs.h. Extract:

    cofunc = getmultibytecodec();
    ...
    codecobj = PyCapsule_New((void *)codec, PyMultibyteCodec_CAPSULE_NAME, NULL);
    if (codecobj == NULL)
        return NULL;

    r = PyObject_CallOneArg(cofunc, codecobj);
    Py_DECREF(codecobj);

getmultibytecodec() is the _multibytecodec.__create_codec() which is defined in Modules/cjkcodecs/multibytecodec.c. Simplified code:

    codec = PyCapsule_GetPointer(arg, PyMultibyteCodec_CAPSULE_NAME);
    _multibytecodec_state *state = _multibytecodec_get_state(module);
    self = PyObject_New(MultibyteCodecObject, state->multibytecodec_type);
    self->codec = codec;
    return (PyObject *)self;
msg384645 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 09:31
I added the test which leaks in bpo-42846.

commit 07f2cee93f1b619650403981c455f47bfed8d818
Author: Victor Stinner <vstinner@python.org>
Date:   Fri Jan 8 00:15:22 2021 +0100

    bpo-42846: Convert CJK codec extensions to multiphase init (GH-24157)
    
    Convert the 6 CJK codec extension modules (_codecs_cn, _codecs_hk,
    _codecs_iso2022, _codecs_jp, _codecs_kr and _codecs_tw) to the
    multiphase initialization API (PEP 489).
    
    Remove getmultibytecodec() local cache: always import
    _multibytecodec. It should be uncommon to get a codec. For example,
    this function is only called once per CJK codec module.
    
    Fix a reference leak in register_maps() error path.

I don't think that the leak is new. It's just that it wasn't seen previously.
msg384647 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 09:34
By the way, I wrote an article "Leaks discovered by subinterpreters":
https://vstinner.github.io/subinterpreter-leaks.html

This leak may be new kind related to capsule, I'm not sure so far.
msg384648 - (view) Author: Erlend Egeberg Aasland (erlendaasland) * Date: 2021-01-08 09:44
Thank you so much for taking the time to write these blog posts, Victor, and for explaining your fixes is such great detail. It is very helpful!
msg384650 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 09:59
PR 24165 fix one reference leak in the getcodec() function of CJK codecs. But it doesn't fix all ref leaks of this issue.
msg384651 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 10:24
The leak can be simplified to:

    @support.cpython_only
    def test_subinterp(self):
        import _testcapi
        code = textwrap.dedent("""
            import encodings
            import _codecs_jp
            encodings._cache['cp932'] = _codecs_jp.getcodec('cp932')
        """)
        res = _testcapi.run_in_subinterp(code)
        self.assertEqual(res, 0)
msg384652 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 10:29
Links about ref leaks related to encodings:

* https://bugs.python.org/issue1635741#msg364833 
* https://bugs.python.org/issue1635741#msg364968
* https://bugs.python.org/issue36854#msg357160
* https://github.com/python/cpython/compare/master...phsilva:remove-codec-caches
msg384653 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 12:00
See also bpo-42671 "Make the Python finalization more deterministic" but it seems like PR 23826 makes the leak worse (+2000 leaked references instead of +200) :-)
msg384654 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 12:23
> encodings._cache['cp932'] = _codecs_jp.getcodec('cp932')

* encodings._cache is kept alive by encodings.search_function.__globals__
* encodings.search_function function is kept alive by PyInterpreterState.codec_search_path list. The function by _PyCodec_Register() in encodings/__init__.py: codecs.register(search_function).

For example, unregistering the search function prevents the leak:

            import encodings
            import _codecs_jp
            encodings._cache['cp932'] = _codecs_jp.getcodec('cp932')

            import codecs
            codecs.unregister(encodings.search_function)

The PyInterpreterState.codec_search_path list is cleared at Python exit by interpreter_clear().

The weird part is that the _codecs_jp.getcodec('cp932') codec object *is* deleted. I checked and multibytecodec_dealloc() is called with the object stored in the encodings cache.

A _multibytecodec.MultibyteCodec instance (MultibyteCodecObject* structure in C) is a simple type: it only stores pointer to C functions and C strings. It doesn't contain any Python object. So I don't see how it could be part of a reference cycle by itself. Moreover, again, it is deleted.
msg384661 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 13:59
Calling gc.collect() twice works around the issue, which sounds like a missing traverse function somewhere.

diff --git a/Python/pystate.c b/Python/pystate.c
index c791b23999..66bbe1bf7d 100644
--- a/Python/pystate.c
+++ b/Python/pystate.c
@@ -321,6 +321,7 @@ interpreter_clear(PyInterpreterState *interp, PyThreadState *tstate)
 
     /* Last garbage collection on this interpreter */
     _PyGC_CollectNoFail(tstate);
+    _PyGC_CollectNoFail(tstate);
     _PyGC_Fini(tstate);
 
     /* We don't clear sysdict and builtins until the end of this function.
msg384662 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 14:01
New changeset e542d417b96077d04aec089505eacb990c9799ae by Victor Stinner in branch 'master':
bpo-42866: Fix refleak in CJK getcodec() (GH-24165)
https://github.com/python/cpython/commit/e542d417b96077d04aec089505eacb990c9799ae
msg384665 - (view) Author: Erlend Egeberg Aasland (erlendaasland) * Date: 2021-01-08 14:12
Don't you need to free memory using PyObject_GC_Del when you allocate using PyObject_GC_New?
msg384667 - (view) Author: Erlend Egeberg Aasland (erlendaasland) * Date: 2021-01-08 14:14
Ref. https://docs.python.org/3/c-api/typeobj.html#Py_TPFLAGS_HAVE_GC
msg384669 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 14:23
> Don't you need to free memory using PyObject_GC_Del when you allocate using PyObject_GC_New?

My PR uses the generic tp->tp_free which PyObject_GC_Del().
msg384670 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 14:23
typo: My PR uses the generic tp->tp_free which *is* PyObject_GC_Del().
msg384671 - (view) Author: Erlend Egeberg Aasland (erlendaasland) * Date: 2021-01-08 14:29
Ah, thanks! I also found that info in the docs: https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_free
msg384673 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 14:44
New changeset 11ef53aefbecfac18b63cee518a7184f771f708c by Victor Stinner in branch 'master':
bpo-42866: Add traverse func to _multibytecodec.MultibyteCodec (GH-24166)
https://github.com/python/cpython/commit/11ef53aefbecfac18b63cee518a7184f771f708c
msg384675 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-08 14:49
Ok, it's now fixed.

To make sure that a heap type can be deleted, it must by a GC type, has a traverse function, and it must track its instances. We should take this in account when we convert static types to heap types. In practice, deleting a type is mostly an issue when we check for reference leak and an instance is kept alive until the last GC collection, at the end of Py_EndInterpreter().
History
Date User Action Args
2021-01-08 14:49:29vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg384675

stage: patch review -> resolved
2021-01-08 14:44:10vstinnersetmessages: + msg384673
2021-01-08 14:29:01erlendaaslandsetmessages: + msg384671
2021-01-08 14:26:04miss-islingtonsetpull_requests: + pull_request22996
2021-01-08 14:24:52miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request22995
2021-01-08 14:23:42vstinnersetmessages: + msg384670
2021-01-08 14:23:23vstinnersetmessages: + msg384669
2021-01-08 14:14:03erlendaaslandsetmessages: + msg384667
2021-01-08 14:12:48erlendaaslandsetmessages: + msg384665
2021-01-08 14:02:53vstinnersetpull_requests: + pull_request22994
2021-01-08 14:01:49vstinnersetmessages: + msg384662
2021-01-08 13:59:38vstinnersetmessages: + msg384661
2021-01-08 12:23:51vstinnersetmessages: + msg384654
2021-01-08 12:00:06vstinnersetmessages: + msg384653
2021-01-08 10:29:30vstinnersetmessages: + msg384652
2021-01-08 10:24:16vstinnersetmessages: + msg384651
2021-01-08 09:59:16vstinnersetmessages: + msg384650
2021-01-08 09:58:17vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request22993
2021-01-08 09:44:39erlendaaslandsetmessages: + msg384648
2021-01-08 09:34:30vstinnersetmessages: + msg384647
2021-01-08 09:32:50vstinnersetnosy: + corona10, erlendaasland
2021-01-08 09:31:03vstinnersetmessages: + msg384645
2021-01-08 09:28:39vstinnercreate