classification
Title: Py_Finalize() doesn't clear all Python objects at exit
Type: resource usage Stage: patch review
Components: Interpreter Core, Subinterpreters Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Jim Fasarakis-Hilliard, amaury.forgeotdarc, corona10, eric.snow, isoschiz, kylotan, lukasz.langa, miss-islington, pconnell, phsilva, santoso.wijaya, shihai1991, tlesher, vstinner, ysj.ray
Priority: normal Keywords: patch

Created on 2007-01-15 10:26 by kylotan, last changed 2020-05-31 14:08 by Jim Fasarakis-Hilliard.

Pull Requests
URL Status Linked Edit
PR 17835 merged shihai1991, 2020-01-05 13:28
PR 17883 closed shihai1991, 2020-01-07 00:36
PR 18030 merged shihai1991, 2020-01-16 11:57
PR 18032 closed shihai1991, 2020-01-16 16:27
PR 18049 closed shihai1991, 2020-01-18 10:45
PR 18050 merged shihai1991, 2020-01-18 11:05
PR 18065 merged shihai1991, 2020-01-19 11:02
PR 18066 closed shihai1991, 2020-01-19 11:25
PR 18358 merged shihai1991, 2020-02-05 07:54
PR 18365 closed shihai1991, 2020-02-05 12:50
PR 18374 merged shihai1991, 2020-02-06 08:57
PR 18404 merged shihai1991, 2020-02-07 12:55
PR 18486 merged shihai1991, 2020-02-12 14:31
PR 18608 merged shihai1991, 2020-02-22 14:31
PR 18613 merged shihai1991, 2020-02-23 07:38
PR 19012 merged shihai1991, 2020-03-15 07:38
PR 19015 merged corona10, 2020-03-15 11:53
PR 19018 merged shihai1991, 2020-03-15 14:12
PR 19022 merged miss-islington, 2020-03-15 19:39
PR 19021 merged miss-islington, 2020-03-15 19:39
PR 19044 merged corona10, 2020-03-17 15:18
PR 19057 merged corona10, 2020-03-18 10:50
PR 19069 open shihai1991, 2020-03-19 10:40
PR 19071 closed corona10, 2020-03-19 14:35
PR 19074 merged corona10, 2020-03-19 15:22
PR 19084 merged shihai1991, 2020-03-20 05:31
PR 19100 closed shihai1991, 2020-03-21 10:00
PR 19107 merged phsilva, 2020-03-22 04:03
PR 19122 open phsilva, 2020-03-23 18:31
PR 19128 merged vstinner, 2020-03-23 22:45
PR 19135 merged vstinner, 2020-03-24 15:23
PR 19140 merged vstinner, 2020-03-24 17:05
PR 19150 merged phsilva, 2020-03-25 01:18
PR 19151 merged phsilva, 2020-03-25 01:28
PR 19242 merged corona10, 2020-03-31 12:14
PR 19243 merged corona10, 2020-03-31 13:20
PR 19252 merged shihai1991, 2020-03-31 16:17
PR 19307 merged shihai1991, 2020-04-02 15:30
PR 19382 open corona10, 2020-04-05 03:34
PR 19459 open corona10, 2020-04-10 15:09
PR 19798 merged corona10, 2020-04-29 16:45
PR 19822 merged vstinner, 2020-04-30 21:01
PR 19907 merged corona10, 2020-05-04 18:39
PR 19923 merged corona10, 2020-05-05 12:19
PR 20540 open corona10, 2020-05-30 14:29
Messages (63)
msg61054 - (view) Author: B Sizer (kylotan) Date: 2007-01-15 10:26
This C code:

#include <Python.h>
int main(int argc, char *argv[])
{
    Py_Initialize(); Py_Finalize();
    Py_Initialize(); Py_Finalize();
    Py_Initialize(); Py_Finalize();
    Py_Initialize(); Py_Finalize();
    Py_Initialize(); Py_Finalize();
    Py_Initialize(); Py_Finalize();
    Py_Initialize(); Py_Finalize();
}

Produces this output:
[7438 refs]
[7499 refs]
[7550 refs]
[7601 refs]
[7652 refs]
[7703 refs]
[7754 refs]

A similar program configured to call Py_Initialize()/Py_Finalize() 1000 times ends up with:
...
[58295 refs]
[58346 refs]
[58397 refs]

This is with a fresh debug build of Python 2.5.0 on Windows XP, using Visual C++ 2003.
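
The growth per cycle can be read off the totals above. A minimal sketch in Python (the helper name `deltas` is made up for this example) computes the differences between consecutive `[N refs]` lines. Assuming the 1000-iteration run starts from the same first total, a steady leak of 51 references per cycle after the first step accounts for the last total reported for that run:

```python
# Totals printed after each Py_Initialize()/Py_Finalize() cycle in the report above.
totals = [7438, 7499, 7550, 7601, 7652, 7703, 7754]


def deltas(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


per_cycle = deltas(totals)
print(per_cycle)  # [61, 51, 51, 51, 51, 51]

# Extrapolate to 1000 cycles: the first total, one step of 61, then 998 steps of 51.
# Assumption: the 1000-iteration run starts from the same first total.
projected = totals[0] + per_cycle[0] + per_cycle[1] * 998
print(projected)  # 58397, the last total shown for the 1000-iteration run
```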
msg110895 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-07-20 13:41
Does the title of this issue accurately reflect the current status of the Python interpreter?
msg111024 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-07-21 09:20
Yes, some objects are not cleaned in finalization.
This is not a problem in usual cases though, when the interpreter is
started only once.
msg130729 - (view) Author: ysj.ray (ysj.ray) Date: 2011-03-13 07:12
> Does the title of this issue accurately reflect the current status of the Python interpreter?

Yes, here is the running result on current 3.3 latest code:
[37182 refs]
[39415 refs]
[41607 refs]
[43799 refs]
[45991 refs]
[48183 refs]
[50375 refs]


This seems to be a known bug: according to the documentation, Py_Finalize() doesn't free all objects. See http://docs.python.org/dev/c-api/init.html?highlight=py_finalize#Py_Finalize
msg248761 - (view) Author: Alex Budovski (Alex Budovski) Date: 2015-08-18 06:20
Interestingly enough, some of the leaked memory came from the finalize routine itself! Here's one example:

0:004> !heap -p -a 0x000000DB144346F0
    address 000000db144346f0 found in
    _HEAP @ db0cae0000
              HEAP_ENTRY Size Prev Flags            UserPtr UserSize - state
        000000db14434690 030a 0000  [00]   000000db144346c0    03074 - (busy)
        7ffc55628b04 ntdll!RtlpCallInterceptRoutine+0x0000000000000040
        7ffc555f9f36 ntdll!RtlAllocateHeap+0x0000000000079836
        7ffc2a60c4da ucrtbased!calloc_base+0x000000000000123a
        7ffc2a60c27d ucrtbased!calloc_base+0x0000000000000fdd
        7ffc2a60f34f ucrtbased!malloc_dbg+0x000000000000002f
        7ffc2a60fdde ucrtbased!malloc+0x000000000000001e
        5a5e6ef9 python36_d!_PyMem_RawMalloc+0x0000000000000029
        5a5e78c7 python36_d!_PyMem_DebugAlloc+0x0000000000000087
        5a5e5e6f python36_d!_PyMem_DebugMalloc+0x000000000000001f
        5a5e7230 python36_d!PyMem_Malloc+0x0000000000000030
        5a582047 python36_d!new_keys_object+0x0000000000000077
        5a57f7c5 python36_d!dictresize+0x0000000000000085
        5a57a4b2 python36_d!PyDict_Merge+0x0000000000000112
        5a57bf33 python36_d!PyDict_Update+0x0000000000000023
        5a75fb1d python36_d!PyImport_Cleanup+0x000000000000045d
        5a778f9e python36_d!Py_Finalize+0x000000000000005e
msg355187 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-23 00:11
I tested on the master branch of Python:
---
#include <Python.h>

void func()
{
    Py_Initialize(); Py_Finalize();
    Py_ssize_t cnt = _Py_GetRefTotal();
    printf("sys.gettotalrefcount(): %zd\n", cnt);
}

int main(int argc, char *argv[])
{
    Py_SetProgramName(L"./_testembed");
    for (int i=0; i < 10; i++) {
        func();
    }
}
---

Each iteration leaks about 4,400 references (the total grows by exactly 4,414 each time):
---
sys.gettotalrefcount(): 15113
sys.gettotalrefcount(): 19527
sys.gettotalrefcount(): 23941
sys.gettotalrefcount(): 28355
sys.gettotalrefcount(): 32769
sys.gettotalrefcount(): 37183
sys.gettotalrefcount(): 41597
sys.gettotalrefcount(): 46011
sys.gettotalrefcount(): 50425
sys.gettotalrefcount(): 54839
---
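
The same counter can also be read from Python code. A small sketch, assuming only that `sys.gettotalrefcount()` is present on debug builds and absent on release builds (the helper name `total_refcount` is made up for this example):

```python
import sys


def total_refcount():
    """Return the interpreter-wide reference total, or None on a release build."""
    getter = getattr(sys, "gettotalrefcount", None)
    return getter() if getter is not None else None


count = total_refcount()
if count is None:
    print("release build: sys.gettotalrefcount() is not available")
else:
    print("sys.gettotalrefcount():", count)
```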
msg355189 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-23 00:12
I marked bpo-6741 as a duplicate of this issue.
msg355191 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-23 00:16
I marked bpo-26888 as a duplicate of this issue.
msg355193 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-23 00:17
I marked bpo-21387 as a duplicate of this issue.
msg355194 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-23 00:19
One part of this issue is that all C extensions of the stdlib should be updated to implement the PEP 489 "Multi-phase extension module initialization".
msg355201 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-23 00:32
I marked bpo-32026 as a duplicate of this issue.
msg359342 - (view) Author: hai shi (shihai1991) * Date: 2020-01-05 13:30
> One part of this issue is that all C extensions of the stdlib should be updated to implement the PEP 489 "Multi-phase extension module initialization".

I tried to port the _json extension module to multiphase initialization, but the baseline (using Victor's code) in my VM did not change.
msg359482 - (view) Author: hai shi (shihai1991) * Date: 2020-01-07 00:41
Compared to _Py_ForgetReference(), _Py_INC_REFTOTAL in _Py_NewReference() looks redundant.

REF: https://github.com/python/cpython/blob/master/Include/object.h#L442

master branch baseline in my VM:
```
sys.gettotalrefcount(): 18049
sys.gettotalrefcount(): 22463
```

After PR17883:
```
sys.gettotalrefcount(): 17589
sys.gettotalrefcount(): 22000
```
msg359830 - (view) Author: hai shi (shihai1991) * Date: 2020-01-12 03:11
FWIW, I counted the per-file difference in refs after `Py_Finalize()`.

[('Objects/dictobject.c', 21434), ('Python/marshal.c', 8135), ('Objects/codeobject.c', 6245), ('Objects/listobject.c', 6037), ('Objects/tupleobject.c', 4169), ('Objects/boolobject.c', 2433), ('Objects/object.c', 2364), ('Objects/unicodeobject.c', 1541), ('Objects/longobject.c', 1387), ('Objects/funcobject.c', 528), ('Objects/classobject.c', 528), ('Objects/abstract.c', 463), ('Python/structmember.c', 369), ('./Include/objimpl.h', 277), ('Objects/stringlib/partition.h', 273), ('Python/import.c', 259), ('Python/codecs.c', 197), ('./Modules/signalmodule.c', 61), ('./Modules/_threadmodule.c', 59), ('Objects/exceptions.c', 15), ('Objects/bytesobject.c', 5), ('./Modules/_weakref.c', 4), ('Python/_warnings.c', 3), ('./Modules/timemodule.c', 1), ('./Modules/_codecsmodule.c', 1), ('Objects/bytearrayobject.c', 1), ('Python/compile.c', 1), ('Objects/sliceobject.c', 0), ('Objects/memoryobject.c', 0), ('Python/context.c', -1), ('Objects/clinic/longobject.c.h', -1), ('Objects/enumobject.c', -1), ('Modules/gcmodule.c', -1), ('Objects/namespaceobject.c', -1), ('Objects/stringlib/unicode_format.h', -2), ('Objects/rangeobject.c', -3), ('Python/pystate.c', -4), ('Objects/fileobject.c', -14), ('./Modules/_io/clinic/bufferedio.c.h', -17), ('./Modules/_io/iobase.c', -21), ('Python/modsupport.c', -28), ('./Modules/_io/fileio.c', -28), ('Python/pylifecycle.c', -37), ('./Modules/_io/textio.c', -39), ('Objects/genobject.c', -53), ('Objects/weakrefobject.c', -54), ('./Modules/_io/bufferedio.c', -56), ('./Python/sysmodule.c', -68), ('./Modules/_io/_iomodule.c', -82), ('Python/errors.c', -90), ('Objects/descrobject.c', -110), ('Objects/structseq.c', -113), ('Python/bltinmodule.c', -118), ('Objects/setobject.c', -339), ('Objects/moduleobject.c', -454), ('./Modules/posixmodule.c', -614), ('./Modules/_abc.c', -664), ('Objects/call.c', -755), ('Objects/typeobject.c', -2035), ('Objects/frameobject.c', -6538), ('Python/ceval.c', -7857), ('./Include/object.h', -48292)]
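
The message does not say how these counts were collected. As a sketch only, a list of this shape could be built from two per-file snapshots like this (the function name `ref_diff` and the snapshot numbers are hypothetical, not measurements from CPython):

```python
def ref_diff(before, after):
    """Return (name, after - before) pairs, largest gain first."""
    names = set(before) | set(after)
    diff = {name: after.get(name, 0) - before.get(name, 0) for name in names}
    # Sort by descending difference, then by name for a stable order.
    return sorted(diff.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical snapshots: made-up numbers used only to show the shape of the data.
before = {"Objects/dictobject.c": 100, "Python/ceval.c": 900, "Objects/sliceobject.c": 7}
after = {"Objects/dictobject.c": 350, "Python/ceval.c": 400, "Objects/sliceobject.c": 7}

result = ref_diff(before, after)
print(result)
# [('Objects/dictobject.c', 250), ('Objects/sliceobject.c', 0), ('Python/ceval.c', -500)]
```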
msg360063 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-15 16:32
New changeset ed154c387efc5f978ec97900ec9e0ec6631d5498 by Victor Stinner (Hai Shi) in branch 'master':
bpo-1635741: Port _json extension module to multiphase initialization (PEP 489) (GH-17835)
https://github.com/python/cpython/commit/ed154c387efc5f978ec97900ec9e0ec6631d5498
msg361427 - (view) Author: hai shi (shihai1991) * Date: 2020-02-05 13:15
I think that not checking `PyModule_AddObject()`'s result may cause this problem too.

1) python-ast.c has one such issue; I fixed it in PR18358.
2) Most of the issues are in extension modules, for example: https://github.com/python/cpython/blob/master/Modules/gcmodule.c#L2019-L2022
msg361428 - (view) Author: hai shi (shihai1991) * Date: 2020-02-05 13:17
Update to the above info:
1) python-ast.c has one such issue; I fixed it in PR18365.
msg361466 - (view) Author: hai shi (shihai1991) * Date: 2020-02-06 01:21
> 1) python-ast.c have one question, i fix it in PR18365.
> 2) most of the questions in extension module, for example: https://github.com/python/cpython/blob/master/Modules/gcmodule.c#L2019-L2022

Brandt has already done relevant work in PR17276PR38823.
msg361798 - (view) Author: miss-islington (miss-islington) Date: 2020-02-11 11:16
New changeset 1ea45ae257971ee7b648e3b031603a31fc059f81 by Hai Shi in branch 'master':
bpo-1635741: Port _codecs extension module to multiphase initialization (PEP 489) (GH-18065)
https://github.com/python/cpython/commit/1ea45ae257971ee7b648e3b031603a31fc059f81
msg362068 - (view) Author: hai shi (shihai1991) * Date: 2020-02-16 12:56
A note for myself:
I roughly checked the remaining objects (through the dump_refs function); most of the remaining objects are 'str', such as:
'0x7f779cf88880 [13] str'->'0x7f779cf88880 [26] str'

So far, I don't know which file and line number create those objects. Maybe I need to find a hack to tag these allocations? (not sure)
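
One existing tool that records where an object was allocated is the standard `tracemalloc` module. The sketch below is only a partial answer to the question above: it reports Python-level frames, not the C source file and line, and it only covers objects allocated after `tracemalloc.start()`. The helper name `allocation_site` is made up for this example.

```python
import tracemalloc


def allocation_site(obj):
    """Return formatted traceback lines for where obj was allocated, if traced."""
    tb = tracemalloc.get_object_traceback(obj)
    return tb.format() if tb is not None else []


tracemalloc.start(10)        # keep up to 10 frames per allocation
n = 1000
payload = "x" * n            # built at run time, so it is allocated after start()
lines = allocation_site(payload)
tracemalloc.stop()           # stop tracing and drop the recorded traces

for line in lines:
    print(line)
```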
msg362124 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-02-17 09:11
New changeset b2b6e27bcab44e914d0a0b170e915d6f1604a76d by Hai Shi in branch 'master':
bpo-1635741: Port _crypt extension module to multiphase initialization (PEP 489) (GH-18404)
https://github.com/python/cpython/commit/b2b6e27bcab44e914d0a0b170e915d6f1604a76d
msg362143 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-02-17 13:49
New changeset 7d7956833cc37a9d42807cbfeb7dcc041970f579 by Hai Shi in branch 'master':
bpo-1635741: Port _contextvars module to multiphase initialization (PEP 489) (GH-18374)
https://github.com/python/cpython/commit/7d7956833cc37a9d42807cbfeb7dcc041970f579
msg362144 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-02-17 13:50
New changeset 4c1b6a6f4fc46add0097efb3026cf3f0c89f88a2 by Hai Shi in branch 'master':
bpo-1635741: Port _abc extension to multiphase initialization (PEP 489) (GH-18030)
https://github.com/python/cpython/commit/4c1b6a6f4fc46add0097efb3026cf3f0c89f88a2
msg362195 - (view) Author: miss-islington (miss-islington) Date: 2020-02-18 11:17
New changeset 5d38517aa1836542a5417b724c093bcb245f0f47 by Hai Shi in branch 'master':
bpo-1635741: Port _bz2 extension module to multiphase initialization(PEP 489) (GH-18050)
https://github.com/python/cpython/commit/5d38517aa1836542a5417b724c093bcb245f0f47
msg363935 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-11 16:46
New changeset a158168a787e82c4b7b18f6833153188e93627a5 by Hai Shi in branch 'master':
bpo-1635741: Port _locale extension module to multiphase initialization (PEP 489) (GH-18358)
https://github.com/python/cpython/commit/a158168a787e82c4b7b18f6833153188e93627a5
msg363936 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-11 16:49
New changeset 41fbf865a35d4fb64f047f98dc24690cb0c170fd by Hai Shi in branch 'master':
bpo-1635741: Port audioop extension module to multiphase initialization (PEP 489) (GH-18608)
https://github.com/python/cpython/commit/41fbf865a35d4fb64f047f98dc24690cb0c170fd
msg363937 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-11 16:50
New changeset aa0c0808efbfdee813d2829e49030c667da44e72 by Hai Shi in branch 'master':
bpo-1635741: Fix potential refleaks in binascii module (GH-18613)
https://github.com/python/cpython/commit/aa0c0808efbfdee813d2829e49030c667da44e72
msg363940 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-11 16:53
Thanks Hai Shi for your 3 latest PRs, I merged them.
msg363941 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-11 16:56
New changeset 196f1eb6adcfc6a7239330ef508b8bf9dff9940f by Hai Shi in branch 'master':
bpo-1635741: Fix refleaks of time module error handling (GH-18486)
https://github.com/python/cpython/commit/196f1eb6adcfc6a7239330ef508b8bf9dff9940f
msg364234 - (view) Author: hai shi (shihai1991) * Date: 2020-03-15 14:02
Hundreds of encoding names are not released in Py_Finalize().
For example:
```
0x7ff482f589e0 [1] 'iso_8859_1_1987'
0x7ff482f58970 [1] 'iso_8859_1'
```
-->
```
0x7ff482f589e0 [2] 'iso_8859_1_1987'
0x7ff482f58970 [2] 'iso_8859_1'
```
msg364330 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-16 15:10
New changeset 356c878fbf2a97aa3ab7951fd7456d219ff0b466 by Dong-hee Na in branch 'master':
bpo-1635741: Port _statistics module to multiphase initialization (GH-19015)
https://github.com/python/cpython/commit/356c878fbf2a97aa3ab7951fd7456d219ff0b466
msg364379 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-17 01:15
New changeset 2037502613471a0a0a0262085cc50adb378ebbad by Hai Shi in branch 'master':
bpo-1635741: Port  _ctypes_test extension to multiphase initialization (PEP 489) (GH-19012)
https://github.com/python/cpython/commit/2037502613471a0a0a0262085cc50adb378ebbad
msg364463 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-17 17:46
New changeset 514c469719f149e1722a91a9d0c63bf89dfefb2a by Dong-hee Na in branch 'master':
bpo-1635741: Port itertools module to multiphase initialization (GH-19044)
https://github.com/python/cpython/commit/514c469719f149e1722a91a9d0c63bf89dfefb2a
msg364521 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-18 14:29
New changeset 4657a8a0d006c76699ba3d1d4d21a04860bb2586 by Dong-hee Na in branch 'master':
bpo-1635741: Port _heapq module to multiphase initialization (GH19057)
https://github.com/python/cpython/commit/4657a8a0d006c76699ba3d1d4d21a04860bb2586
msg364609 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-19 16:16
New changeset 77248a28896d39cae0a7e084965b9ffc2624b7f4 by Dong-hee Na in branch 'master':
bpo-1635741: Port _collections module to multiphase initialization (GH-19074)
https://github.com/python/cpython/commit/77248a28896d39cae0a7e084965b9ffc2624b7f4
msg364656 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-20 08:16
New changeset 8334f30a74abcf7e469b901afc307887aa85a888 by Hai Shi in branch 'master':
bpo-1635741: Port _weakref extension module to multiphase initialization (PEP 489) (GH-19084)
https://github.com/python/cpython/commit/8334f30a74abcf7e469b901afc307887aa85a888
msg364833 - (view) Author: Paulo Henrique Silva (phsilva) * Date: 2020-03-23 01:44
About half of the remaining refs are related to encodings. I noticed that the caches in Lib/encodings/__init__.py and codec_search_cache of PyInterpreterState are the places holding the refs. I removed those caches and the number went down to:

Before: 4382 refs left
After : 2344 refs left (-46%)

The way codec_search_cache is destroyed was recently changed in #36854 and #38962.

(Not proposing to merge this, but my changes are at https://github.com/python/cpython/compare/master...phsilva:remove-codec-caches).
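
The two caches mentioned above can be observed from Python. A minimal sketch, assuming CPython behaviour: `codecs.lookup()` answers a repeated lookup from the interpreter's codec cache, and `encodings._cache` is a private dict (an implementation detail that may change between versions).

```python
import codecs
import encodings

first = codecs.lookup("iso_8859_1")
second = codecs.lookup("iso_8859_1")

# In CPython the second lookup is served from the interpreter's codec cache,
# so both names refer to the same CodecInfo object.
print(first is second)

# encodings._cache is private to Lib/encodings/__init__.py; it is inspected
# here only to show that the module keeps its own references as well.
print(len(encodings._cache), "entries in encodings._cache")
```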
msg364836 - (view) Author: hai shi (shihai1991) * Date: 2020-03-23 05:15
> I noticed that the caches in Lib/encodings/__init__.py and codec_search_cache of PyInterpreterState are the places holding the refs. I removed those caches and the number went down.

Good catch, Paulo.
IMHO, the caches are useful in codecs (they improve search efficiency).

I have two humble ideas:
1. Clear all items of codec_search_xxx in `Py_Finalize()`;
2. Change the refcount mechanism (in this case, refcount+1 or refcount+2 makes no difference);
msg364845 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2020-03-23 11:45
The last merged pull request, GH-19084, causes refleaks in importlib tests. Stable buildbots are failing, and I can reproduce on macOS Catalina.

You can test yourself by running:
$ ./python.exe -E -Wd -m test -uall,-gui -l -L -R: test_importlib

Master at 2de7ac9798 does not fail while the next commit, 8334f30a74, introduces the failure.
msg364871 - (view) Author: hai shi (shihai1991) * Date: 2020-03-23 18:15
> The last merged pull request, GH-19084, causes refleaks in importlib tests. Stable buildbots are failing, and I can reproduce on macOS Catalina.

Thanks, Łukasz.
I caught this problem in my CentOS VM too. I don't know the cause yet.
msg364883 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-23 18:58
New changeset bd409bb5b78e7ccac5fcda9ab4cec770552f3090 by Paulo Henrique Silva in branch 'master':
bpo-1635741: Port time module to multiphase initialization (PEP 489) (GH-19107)
https://github.com/python/cpython/commit/bd409bb5b78e7ccac5fcda9ab4cec770552f3090
msg364906 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-23 22:42
> The last merged pull request, GH-19084, causes refleaks in importlib tests. Stable buildbots are failing, and I can reproduce on macOS Catalina.

I expect that the bug is non-trivial, so I prefer to open a separate issue: bpo-40050 "test_importlib leaked [6303, 6299, 6303] references".
msg364909 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-23 23:48
New changeset 188078c39dec24aa5b3f2073bdc9a68ebaae42de by Victor Stinner in branch 'master':
Revert "bpo-1635741: Port _weakref extension module to multiphase initialization (PEP 489) (GH-19084)" (#19128)
https://github.com/python/cpython/commit/188078c39dec24aa5b3f2073bdc9a68ebaae42de
msg364951 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-24 17:31
New changeset 93460d097f50db0870161a63911d61ce3c5f4583 by Victor Stinner in branch 'master':
bpo-1635741: Port _weakref extension module to multiphase initialization (PEP 489) (GH-19140)
https://github.com/python/cpython/commit/93460d097f50db0870161a63911d61ce3c5f4583
msg364952 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-24 17:32
I managed to identify bpo-40050 (test_importlib reference leak) root issue and to fix it, so I reapplied Hai Shi's change for _weakref.
msg364968 - (view) Author: Paulo Henrique Silva (phsilva) * Date: 2020-03-25 01:46
An update on my findings in msg364833.

It looks like the encodings module is not being destroyed at all and is keeping all the encoding refs alive. It looks like a reference cycle, but I am not sure yet how to solve it.

To validate this, I:
 - removed codec_search_cache of PyInterpreterState.
 - called Py_DECREF(encodings) after loading it in codecs.c.

Before: 4376 refs left (37fcbb65d4)
After :  352 refs left (-92%)

I've updated the changes at https://github.com/python/cpython/compare/master...phsilva:remove-codec-caches (not a proposed patch, just to validate the idea)
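
As a general illustration of how a cycle keeps objects alive (not a diagnosis of the encodings module), the sketch below builds a two-object reference cycle and shows that it survives `del` until the cycle collector runs. The class `Node` exists only for this example.

```python
import gc
import weakref


class Node:
    """Plain object used only to build a reference cycle."""


gc.disable()                 # keep the collector from running on its own
a, b = Node(), Node()
a.other, b.other = b, a      # a -> b -> a: a reference cycle
probe = weakref.ref(a)       # lets us observe a without keeping it alive

del a, b                     # no names left, but the cycle keeps both alive
alive_before = probe() is not None

gc.collect()                 # the cycle collector finds and frees the cycle
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)  # True False
```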
msg364971 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-25 02:18
New changeset f3d5ac47720045a72f7ef5af13046d9531e6007b by Paulo Henrique Silva in branch 'master':
bpo-1635741: Port operator module to multiphase initialization (PEP 489) (GH-19150)
https://github.com/python/cpython/commit/f3d5ac47720045a72f7ef5af13046d9531e6007b
msg364972 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-25 02:20
New changeset 7dd549eb08939e1927fba818116f5202e76f8d73 by Paulo Henrique Silva in branch 'master':
bpo-1635741: Port _functools module to multiphase initialization (PEP 489) (GH-19151)
https://github.com/python/cpython/commit/7dd549eb08939e1927fba818116f5202e76f8d73
msg364973 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-25 02:41
Hum, some clarification is needed here.

"Port xxx extension module to multiphase initialization (PEP 489)" changes are helping to fix "Py_Finalize() doesn't clear all Python objects at exit", but alone they don't fix all issues.

--

For example, if a module still uses globals using "static ..." in C, these globals will not be cleared magically. Example with _datetimemodule.c:

static PyObject *us_per_hour = NULL;    /* 1e6 * 3600 as Python int */
static PyObject *us_per_day = NULL;     /* 1e6 * 3600 * 24 as Python int */
static PyObject *us_per_week = NULL;    /* 1e6*3600*24*7 as Python int */

These variables are initialized once in PyInit__datetime():

    us_per_hour = PyLong_FromDouble(3600000000.0);
    us_per_day = PyLong_FromDouble(86400000000.0);
    us_per_week = PyLong_FromDouble(604800000000.0);

Converting the module to multiphase initialization will not magically clear these variables at exit. The _datetime module should be modified to store these variables in a module state, which could then be cleared at exit.

The binascii module is a good example: it has a module state, traverse, clear and free methods, and it uses the multiphase initialization. This module can be fully unloaded at exit.

It's a "simple" module: it doesn't define types for example.

--

Another issue is that converting a module to the multiphase initialization doesn't magically fully isolate two instances of the module. For example, the _abc module still uses a type defined statically:

static PyTypeObject _abc_data_type = {
    PyVarObject_HEAD_INIT(NULL, 0)
    "_abc_data",                        /*tp_name*/
    sizeof(_abc_data),                  /*tp_basicsize*/
    .tp_dealloc = (destructor)abc_data_dealloc,
    .tp_flags = Py_TPFLAGS_DEFAULT,
    .tp_alloc = PyType_GenericAlloc,
    .tp_new = abc_data_new,
};

Example:

vstinner@apu$ ./python
Python 3.9.0a5+ (heads/pr/19122:0ac3031a80, Mar 25 2020, 02:25:19) 
>>> import _abc
>>> class Bla: pass
... 
>>> _abc._abc_init(Bla)
>>> type(Bla._abc_impl)
<class '_abc_data'>

# load a second instance of the module
>>> import sys; del sys.modules['_abc']
>>> import _abc as _abc2
>>> class Bla2: pass
... 
>>> _abc._abc_init(Bla2)

>>> type(Bla2._abc_impl)
<class '_abc_data'>

# _abc and _abc2 have exactly the same type,
# they are not fully isolated
>>> type(Bla2._abc_impl) is type(Bla._abc_impl)
True


That's more of an issue for subinterpreters: each interpreter should have its own fully isolated instance of a C extension module.
msg364975 - (view) Author: Paulo Henrique Silva (phsilva) * Date: 2020-03-25 03:23
Thanks for the clarifications. I will keep looking for simple modules that have no state and are easy to migrate, but I will also dedicate more time to the more complex ones, like datetime. I'm working on corrections for PR19122.
msg364987 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-25 12:54
> Thanks for the clarifications. I will keep looking for simple modules that have no state and are easy to migrate, but I will also dedicate more time to the more complex ones, like datetime. I'm working on corrections for PR19122.

I like changes which convert C extension modules to the multiphase initialization API since they fix the error path: they implicitly ensure that the module is properly destroyed if something goes wrong.

Moreover, it will ease the work to fix the other issues that I listed.
msg365008 - (view) Author: B Sizer (kylotan) Date: 2020-03-25 18:17
Sorry for the noise, but I just wanted to say thanks to the people working on this issue 13 years after I reported it. :)  Far too many open-source projects arbitrarily close bugs just because they don't have time to fix them and they never get fixed, so I'm glad this wasn't the case here.
msg365017 - (view) Author: hai shi (shihai1991) * Date: 2020-03-25 19:22
>Sorry for the noise, but I just wanted to say thanks to the people working on this issue 13 years after I reported it. :)  Far too many open-source projects arbitrarily close bugs just because they don't have time to fix them and they never get fixed, so I'm glad this wasn't the case here.

cpython is a big family ;)
msg365043 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-26 01:26
> bpo-1635741: Port _functools module to multiphase initialization (PEP 489) (GH-19151)
> https://github.com/python/cpython/commit/7dd549eb08939e1927fba818116f5202e76f8d73

This change introduced a regression: bpo-40071 "test__xxsubinterpreters  leaked [1, 1, 1] references: test_ids_global()".
msg365386 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-31 12:43
New changeset 1cb763b8808745b9a368c1158fda19d329f63f6f by Dong-hee Na in branch 'master':
bpo-1635741: Port _uuid module to multiphase initialization (GH-19242)
https://github.com/python/cpython/commit/1cb763b8808745b9a368c1158fda19d329f63f6f
msg365388 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-31 14:33
New changeset 5be8241392453751beea21d2e32096c15a8d47db by Dong-hee Na in branch 'master':
bpo-1635741: Port math module to multiphase initialization (GH-19243)
https://github.com/python/cpython/commit/5be8241392453751beea21d2e32096c15a8d47db
msg365484 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-01 15:12
I created bpo-40137: a TODO list for when PEP 573 "Module State Access from C Extension Methods" is implemented.

It tracks code that should be fixed once PEP 573 is implemented, like the _functools and _abc modules.
msg365584 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-02 12:35
New changeset 45f7008a66a30cdf749ec03e580bd2692be9a8df by Hai Shi in branch 'master':
bpo-1635741: Port resource extension module to multiphase initialization (PEP 489) (GH-19252)
https://github.com/python/cpython/commit/45f7008a66a30cdf749ec03e580bd2692be9a8df
msg365611 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-02 18:00
New changeset 7a6f3bcc43ed729f8038524528c0b326b5610506 by Hai Shi in branch 'master':
bpo-1635741: Fix refleak in _locale init error handling (GH-19307)
https://github.com/python/cpython/commit/7a6f3bcc43ed729f8038524528c0b326b5610506
msg367686 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-04-29 18:20
New changeset 84724dd239c30043616487812f6a710b1d70cd4b by Dong-hee Na in branch 'master':
bpo-1635741: Port _stat module to multiphase initialization (GH-19798)
https://github.com/python/cpython/commit/84724dd239c30043616487812f6a710b1d70cd4b
msg367796 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-30 22:44
New changeset b66c0ff8af0c1a4adc6908897b2d05afc78cc27e by Victor Stinner in branch 'master':
bpo-1635741: Fix compiler warning in _stat.c (GH-19822)
https://github.com/python/cpython/commit/b66c0ff8af0c1a4adc6908897b2d05afc78cc27e
msg368096 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-05-05 01:49
New changeset 92a98ed97513c6e365ce8765550ea65d0ddc8cd7 by Dong-hee Na in branch 'master':
bpo-1635741: Port syslog module to multiphase initialization (GH-19907)
https://github.com/python/cpython/commit/92a98ed97513c6e365ce8765550ea65d0ddc8cd7
msg368318 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-05-07 01:17
New changeset 3466922320d54a922cfe6d6d44e89e1cea4023ef by Dong-hee Na in branch 'master':
bpo-1635741: Port errno module to multiphase initialization (GH-19923)
https://github.com/python/cpython/commit/3466922320d54a922cfe6d6d44e89e1cea4023ef
History
Date User Action Args
2020-05-31 14:08:38 Jim Fasarakis-Hilliard set nosy: + Jim Fasarakis-Hilliard
2020-05-30 14:29:34 corona10 set pull_requests: + pull_request19783
2020-05-18 13:05:53 vstinner set components: + Subinterpreters
2020-05-07 01:17:23 corona10 set messages: + msg368318
2020-05-05 12:19:46 corona10 set pull_requests: + pull_request19238
2020-05-05 01:49:52 corona10 set messages: + msg368096
2020-05-04 18:39:07 corona10 set pull_requests: + pull_request19222
2020-04-30 22:44:06 vstinner set messages: + msg367796
2020-04-30 21:01:40 vstinner set pull_requests: + pull_request19142
2020-04-29 18:20:34 corona10 set messages: + msg367686
2020-04-29 16:45:27 corona10 set pull_requests: + pull_request19119
2020-04-10 15:09:27 corona10 set pull_requests: + pull_request18815
2020-04-05 03:34:49 corona10 set pull_requests: + pull_request18745
2020-04-04 17:51:43 corona10 set pull_requests: - pull_request18728
2020-04-04 17:41:31 corona10 set pull_requests: + pull_request18728
2020-04-02 18:00:56 vstinner set messages: + msg365611
2020-04-02 15:30:23 shihai1991 set pull_requests: + pull_request18669
2020-04-02 12:35:26 vstinner set messages: + msg365584
2020-04-01 15:12:34 vstinner set messages: + msg365484
2020-03-31 16:17:39 shihai1991 set pull_requests: + pull_request18610
2020-03-31 14:33:29 vstinner set messages: + msg365388
2020-03-31 13:20:32 corona10 set pull_requests: + pull_request18602
2020-03-31 12:43:53 vstinner set messages: + msg365386
2020-03-31 12:14:26 corona10 set pull_requests: + pull_request18601
2020-03-26 01:26:13 vstinner set messages: + msg365043
2020-03-25 19:22:48 shihai1991 set messages: + msg365017
2020-03-25 18:17:35 kylotan set messages: + msg365008
2020-03-25 12:54:50 vstinner set messages: + msg364987
2020-03-25 03:23:33 phsilva set messages: + msg364975
2020-03-25 02:41:53 vstinner set messages: + msg364973
2020-03-25 02:20:06 vstinner set messages: + msg364972
2020-03-25 02:18:53 vstinner set messages: + msg364971
2020-03-25 01:46:40 phsilva set messages: + msg364968
2020-03-25 01:28:25 phsilva set pull_requests: + pull_request18512
2020-03-25 01:18:23 phsilva set pull_requests: + pull_request18511
2020-03-24 17:34:09 Alex Budovski set nosy: - Alex Budovski
2020-03-24 17:32:49 vstinner set messages: + msg364952
2020-03-24 17:31:25 vstinner set messages: + msg364951
2020-03-24 17:05:25 vstinner set pull_requests: + pull_request18501
2020-03-24 15:23:47 vstinner set pull_requests: + pull_request18497
2020-03-23 23:48:06 vstinner set messages: + msg364909
2020-03-23 22:45:33 vstinner set pull_requests: + pull_request18489
2020-03-23 22:42:55 vstinner set messages: + msg364906
2020-03-23 18:58:31 vstinner set messages: + msg364883
2020-03-23 18:31:28 phsilva set pull_requests: + pull_request18483
2020-03-23 18:15:55 shihai1991 set messages: + msg364871
2020-03-23 11:45:45 lukasz.langa set nosy: + lukasz.langa
messages: + msg364845
2020-03-23 05:15:30 shihai1991 set messages: + msg364836
2020-03-23 01:44:12 phsilva set messages: + msg364833
2020-03-22 04:03:53 phsilva set pull_requests: + pull_request18468
2020-03-21 10:00:25 shihai1991 set pull_requests: + pull_request18460
2020-03-20 08:16:58 vstinner set messages: + msg364656
2020-03-20 05:31:01 shihai1991 set pull_requests: + pull_request18444
2020-03-19 16:16:12 vstinner set messages: + msg364609
2020-03-19 15:22:01 corona10 set pull_requests: + pull_request18430
2020-03-19 14:35:23 corona10 set pull_requests: + pull_request18426
2020-03-19 10:40:07 shihai1991 set pull_requests: + pull_request18423
2020-03-18 14:29:38 vstinner set messages: + msg364521
2020-03-18 10:50:30 corona10 set pull_requests: + pull_request18409
2020-03-17 17:46:32 vstinner set messages: + msg364463
2020-03-17 15:18:25 corona10 set pull_requests: + pull_request18395
2020-03-17 01:15:32 vstinner set messages: + msg364379
2020-03-16 15:10:28 vstinner set messages: + msg364330
2020-03-15 19:39:25 miss-islington set pull_requests: + pull_request18371
2020-03-15 19:39:20 miss-islington set pull_requests: + pull_request18370
2020-03-15 14:12:53 shihai1991 set pull_requests: + pull_request18367
2020-03-15 14:02:30 shihai1991 set messages: + msg364234
2020-03-15 11:53:21 corona10 set nosy: + corona10
pull_requests: + pull_request18363
2020-03-15 07:38:55 shihai1991 set pull_requests: + pull_request18359
2020-03-11 16:56:21 vstinner set messages: + msg363941
2020-03-11 16:53:30 vstinner set messages: + msg363940
2020-03-11 16:50:59 vstinner set messages: + msg363937
2020-03-11 16:49:15 vstinner set messages: + msg363936
2020-03-11 16:46:10 vstinner set messages: + msg363935
2020-02-25 09:04:42 phsilva set nosy: + phsilva
2020-02-23 07:38:38 shihai1991 set pull_requests: + pull_request17978
2020-02-22 14:31:56 shihai1991 set pull_requests: + pull_request17974
2020-02-18 11:17:45 miss-islington set messages: + msg362195
2020-02-17 13:50:39 vstinner set messages: + msg362144
2020-02-17 13:49:33 vstinner set messages: + msg362143
2020-02-17 09:11:37 vstinner set messages: + msg362124
2020-02-16 12:56:28 shihai1991 set messages: + msg362068
2020-02-12 14:31:44 shihai1991 set pull_requests: + pull_request17859
2020-02-11 11:16:45 miss-islington set nosy: + miss-islington
messages: + msg361798
2020-02-07 17:38:13 eric.snow set nosy: + eric.snow
2020-02-07 12:55:50 shihai1991 set pull_requests: + pull_request17780
2020-02-06 08:57:42 shihai1991 set pull_requests: + pull_request17750
2020-02-06 01:21:01 shihai1991 set messages: + msg361466
2020-02-05 13:17:36 shihai1991 set messages: + msg361428
2020-02-05 13:15:21 shihai1991 set messages: + msg361427
2020-02-05 12:50:11 shihai1991 set pull_requests: + pull_request17740
2020-02-05 07:54:13 shihai1991 set pull_requests: + pull_request17733
2020-01-19 11:25:02 shihai1991 set pull_requests: + pull_request17457
2020-01-19 11:02:16 shihai1991 set pull_requests: + pull_request17456
2020-01-18 11:05:22 shihai1991 set pull_requests: + pull_request17445
2020-01-18 10:45:16 shihai1991 set pull_requests: + pull_request17444
2020-01-16 16:27:33 shihai1991 set pull_requests: + pull_request17428
2020-01-16 11:57:57 shihai1991 set pull_requests: + pull_request17425
2020-01-15 16:32:55 vstinner set messages: + msg360063
2020-01-12 03:11:46 shihai1991 set messages: + msg359830
2020-01-07 00:41:22 shihai1991 set messages: + msg359482
2020-01-07 00:36:19 shihai1991 set pull_requests: + pull_request17299
2020-01-05 13:30:21 shihai1991 set nosy: + shihai1991
messages: + msg359342
2020-01-05 13:28:20 shihai1991 set keywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request17262
2019-10-23 00:32:00 vstinner set messages: + msg355201
2019-10-23 00:31:44 vstinner link issue32026 superseder
2019-10-23 00:19:27 vstinner set messages: + msg355194
2019-10-23 00:17:06 vstinner set messages: + msg355193
2019-10-23 00:16:52 vstinner link issue21387 superseder
2019-10-23 00:16:01 vstinner set messages: + msg355191
2019-10-23 00:15:53 vstinner link issue26888 superseder
2019-10-23 00:12:59 vstinner set messages: + msg355189
2019-10-23 00:12:45 vstinner link issue6741 superseder
2019-10-23 00:11:30 vstinner set nosy: + vstinner
title: Interpreter seems to leak references after finalization -> Py_Finalize() doesn't clear all Python objects at exit
messages: + msg355187
versions: + Python 3.9, - Python 3.1, Python 2.7, Python 3.2
2015-08-18 06:20:57 Alex Budovski set nosy: + Alex Budovski
messages: + msg248761
2014-02-03 18:32:01 BreamoreBoy set nosy: - BreamoreBoy
2013-04-20 10:04:07 isoschiz set nosy: + pconnell, isoschiz
2011-03-31 11:49:57 tlesher set nosy: + tlesher
2011-03-13 09:18:08 santoso.wijaya set nosy: + santoso.wijaya
2011-03-13 07:12:36 ysj.ray set nosy: amaury.forgeotdarc, kylotan, ysj.ray, BreamoreBoy
messages: + msg130729
2011-02-14 03:07:28 ysj.ray set nosy: + ysj.ray
2010-07-21 10:21:13 amaury.forgeotdarc link issue8258 superseder
2010-07-21 09:20:21 amaury.forgeotdarc set nosy: + amaury.forgeotdarc
messages: + msg111024
2010-07-20 13:41:15 BreamoreBoy set nosy: + BreamoreBoy
messages: + msg110895
versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6, Python 3.0
2009-03-30 19:04:21 ajaksu2 set stage: test needed
type: resource usage
versions: + Python 2.6, Python 3.0, - Python 2.5
2007-01-15 10:26:05 kylotan create