classification
Title: Make the Python finalization more deterministic
Type: Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, pitrou, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2020-12-17 21:13 by vstinner, last changed 2021-09-21 22:04 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 23826 closed vstinner, 2020-12-17 21:46
Messages (9)
msg383265 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-17 21:13
At exit, Python calls Py_Finalize() which tries to clear every single Python objects. The order is which Python objects are cleared is not fully deterministic. Py_Finalize() uses an heuristic to attempt to clear modules of sys.modules in the "best" order.

The current code creates a weak reference to a module, set sys.modules[name] to None, and then clears the module attribute if and only if the module object was not destroyed (if the weak reference still points to the module).

The problem is that even if a module object is destroyed, the module dictionary can remain alive thanks for various kinds of strong references to it.

Worst case example:
---
class VerboseDel:
    def __del__(self):
        print("Goodbye Cruel World")
obj = VerboseDel()

def func():
    pass

import os
os.register_at_fork(after_in_child=func)
del os
del func

print("exit")
---

Output:
---
$ python3.9 script.py
exit
---

=> The VerboseDel object is never destroyed :-( BUG!


Explanation:

* os.register_at_fork(after_in_child=func) stores func in PyInterpreterState.after_forkers_child -> func() is kept alive until interpreter_clear() calls Py_CLEAR(interp->after_forkers_child);

* func() has reference to the module dictionary

I'm not sure why the VerboseDel object is not destroyed.


I propose to rewrite the finalize_modules() to clear modules in a more deterministic order:

* start by clearing __main__ module variables
* then iterate on reversed(sys.modules.values()) and clear the module variables
* Module attributes are cleared by _PyModule_ClearDict(): iterate on reversed(module.__dict__) and set dict values to None


Drawback: it is a backward incompatible change. Code which worked by luck previously no longer works. I'm talking about applications which rely on __del__() methods being calling in an exact order and expect Python being in a specific state.

Example:
---
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

a = VerboseDel("a")
b = VerboseDel("b")
c = VerboseDel("c")
---

Output:
---
c
b
a
---

=> Module attributes are deleted in the reverse order of their definition: the most recent object is deleted first, the oldest is deleted last.


Example 2 with 3 modules (4 files):
---
$ cat a.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

a = VerboseDel("a")


$ cat b.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

b = VerboseDel("b")


$ cat c.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

c = VerboseDel("c")


$ cat z.py 
import a
import b
import c
---

Output:
---
$ ./python z.py 
c
b
a
---

=> Modules are deleted from the most recently imported (import c) to the least recently imported module (import a).
msg383269 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-17 21:56
I wrote this issue to attempt to fix a leak in PR 23811 which converts the _thread extension module to the multiphase initialization API (PEP 489).
msg383270 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-17 22:00
Modules are already cleared in the reverse order.

Clearing module attributes in the reverse order looks interesting. It increases chance that modules and other imported names which can be used in __del__ will be deleted after executing __del__ if the module has references to an instance of a class defined in the same module. And if the class was defined in other module, it will be deleted later because dependencies are cleared after depended modules.

On other hand, private attributes are currently removed first, so we perhaps need to change it to make the strategy working.
msg383271 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-17 22:05
> Modules are already cleared in the reverse order.

In Python 3.9, _PyImport_Cleanup() of Python/import.c contains the comment:

        /* Since dict is ordered in CPython 3.6+, modules are saved in
           importing order.  First clear modules imported later. */

But using Python 3.9, the z.py example output is:
---
a
b
c
---

The most recently imported module is the last one to be deleted.

The first loop iterates on sys.modules, not on reversed(sys.modules).
msg383272 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-17 22:08
I took some notes on latest changes of the Python finalization order:
https://pythondev.readthedocs.io/finalization.html#reorder-python-finalization

Not directly related, I also took notes on the "weird GC behavior during Python finalization" (such as the issue that I have with my _thread change):
https://pythondev.readthedocs.io/finalization.html#weird-gc-behavior-during-python-finalization
msg383288 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-18 07:36
I experimented with different approaches last evening. Just changing the order of clearing the module dict does not help your first example, because the __main__ module is garbage collected, but its dict is not cleared. Adding _PyModule_ClearDict() in tp_dealloc and tp_clear helps with that example, but it causes several failures and crashes in tests. I have found issue7140 and issue18214 for not clearing module dicts.

So in conclusion, your approach may work. It clears dicts of all modules, but only at shutdown. It did not worked before, when the order of dicts was undetermined, but now we can utilize deterministic order of sys.modules and module dicts.
msg383295 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-18 10:47
> I wrote this issue to attempt to fix a leak in PR 23811 which converts the _thread extension module to the multiphase initialization API (PEP 489).

I fixed this bug by adding a traverse function to the _thread lock type. So I don't *need* this issue to unblock PR 23811 anymore.

But I'm still interested to make the finalization more determistic.
msg383296 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-18 10:51
> Drawback: it is a backward incompatible change. Code which worked by luck previously no longer works. I'm talking about applications which rely on __del__() methods being calling in an exact order and expect Python being in a specific state.

Python does not provide any warranty in which order finalizers are called:
https://docs.python.org/dev/reference/datamodel.html#object.__del__

PyPy is a good concrete example of finalizers being called in a different order.

An application must not rely on the finalizer order, but manage resources explicitly by calling close() methods or using context managers. Python emits ResourceWarning on this purpose.
msg402382 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-21 22:04
I'm not really excited about breaking applications. While there is maybe a way to make the Python finalization more determistic, sorry, I'm not the one who will be brave enough to do it today. I abandon my idea for now.
History
Date User Action Args
2021-09-21 22:04:23vstinnersetstatus: open -> closed
resolution: rejected
messages: + msg402382

stage: patch review -> resolved
2020-12-18 10:51:28vstinnersetmessages: + msg383296
2020-12-18 10:47:41vstinnersetmessages: + msg383295
2020-12-18 07:36:43serhiy.storchakasetnosy: + pitrou, eric.snow
messages: + msg383288
2020-12-17 22:08:08vstinnersetmessages: + msg383272
2020-12-17 22:05:48vstinnersetmessages: + msg383271
2020-12-17 22:00:23serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg383270
2020-12-17 21:56:05vstinnersetmessages: + msg383269
2020-12-17 21:46:17vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request22688
2020-12-17 21:13:20vstinnercreate