
Author vstinner
Recipients corona10, erlendaasland, vstinner
Date 2021-01-08.12:23:51
Message-id <1610108631.55.0.0877328017147.issue42866@roundup.psfhosted.org>
In-reply-to
Content
> encodings._cache['cp932'] = _codecs_jp.getcodec('cp932')

* encodings._cache is kept alive by encodings.search_function.__globals__
* The encodings.search_function function is kept alive by the PyInterpreterState.codec_search_path list. The function is added to that list by _PyCodec_Register() when encodings/__init__.py calls codecs.register(search_function).
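
The first link of that chain can be checked from Python. This is a minimal sketch that only inspects the pure-Python side; the second link (the PyInterpreterState.codec_search_path list) is C-level state and is not directly visible here. The variable names are illustrative.

```python
import encodings

# A function's __globals__ is the namespace of the module that defines it,
# so search_function holds a strong reference to the encodings module dict.
func_globals = encodings.search_function.__globals__

same_namespace = func_globals is vars(encodings)
same_cache = func_globals["_cache"] is encodings._cache

print("shares the module namespace:", same_namespace)
print("reaches encodings._cache:", same_cache)
```

As long as something references search_function, anything stored in encodings._cache stays reachable through it.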

For example, unregistering the search function prevents the leak:

            import encodings
            import _codecs_jp
            encodings._cache['cp932'] = _codecs_jp.getcodec('cp932')

            import codecs
            codecs.unregister(encodings.search_function)
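
To see what unregistering does without touching the real encodings search function, here is a small sketch using a throwaway search function. The name demo_search and the codec name "democodec42866" are made up for illustration. codecs.unregister() was added in Python 3.10, so the sketch skips that step on older versions.

```python
import codecs

def demo_search(name):
    # Hypothetical search function: answers one made-up codec name by
    # returning the CodecInfo of the real UTF-8 codec.
    if name == "democodec42866":
        return codecs.lookup("utf-8")
    return None

codecs.register(demo_search)
found = codecs.lookup("democodec42866")

still_found = None  # stays None if codecs.unregister() is unavailable
if hasattr(codecs, "unregister"):
    # Removes the function from the search path and clears the lookup cache.
    codecs.unregister(demo_search)
    try:
        codecs.lookup("democodec42866")
        still_found = True
    except LookupError:
        still_found = False

print(found.name, still_found)
```

Once the function is unregistered, the interpreter no longer holds a reference to it through the codec search path.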

The PyInterpreterState.codec_search_path list is cleared at Python exit by interpreter_clear().

The weird part is that the _codecs_jp.getcodec('cp932') codec object *is* deleted. I checked and multibytecodec_dealloc() is called with the object stored in the encodings cache.

A _multibytecodec.MultibyteCodec instance (a MultibyteCodecObject* structure in C) is a simple type: it only stores pointers to C functions and C strings. It doesn't contain any Python object, so I don't see how it could be part of a reference cycle by itself. Moreover, again, it is deleted.
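
One way to probe this from Python is to ask the garbage collector what the codec object refers to. This is only a diagnostic sketch: _codecs_jp is a private CPython extension module that may be missing from some builds, and the results can differ between CPython versions, so no particular output is claimed.

```python
import gc

try:
    import _codecs_jp  # private CPython extension module; may be absent
except ImportError:
    _codecs_jp = None

if _codecs_jp is not None:
    codec = _codecs_jp.getcodec("cp932")
    # Objects the GC can see this codec pointing at (empty if the type
    # does not take part in garbage collection at all).
    referents = gc.get_referents(codec)
    tracked = gc.is_tracked(codec)
else:
    referents, tracked = [], False

print("GC-tracked:", tracked)
print("referents:", referents)
```

An empty referents list would be consistent with the object holding no Python references of its own; a non-empty list shows which objects the GC considers reachable from it.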