Author vstinner
Recipients vstinner
Date 2020-12-17.21:13:19
Message-id <1608239600.62.0.0456277879965.issue42671@roundup.psfhosted.org>
In-reply-to
Content
At exit, Python calls Py_Finalize(), which tries to clear every single Python object. The order in which Python objects are cleared is not fully deterministic. Py_Finalize() uses a heuristic to attempt to clear the modules of sys.modules in the "best" order.

The current code creates a weak reference to a module, sets sys.modules[name] to None, and then clears the module attributes if and only if the module object was not destroyed (if the weak reference still points to the module).
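
The heuristic described above can be approximated in pure Python. This is only a rough sketch, not the C implementation: finalize_one_module() and clear_module_dict() are hypothetical helper names, and the clearing step is simplified to "set every value to None".

```python
import sys
import types
import weakref


def clear_module_dict(module_dict):
    """Simplified stand-in for _PyModule_ClearDict(): set every value to None."""
    for key in list(module_dict):
        module_dict[key] = None


def finalize_one_module(name):
    """Hypothetical helper: apply the weak reference heuristic to one module."""
    module = sys.modules[name]
    ref = weakref.ref(module)
    sys.modules[name] = None
    del module  # drop the local strong reference
    survivor = ref()
    if survivor is not None:
        # Something else still references the module: clear its attributes.
        clear_module_dict(survivor.__dict__)
        return "cleared"
    # The module object is gone, so its dictionary is not cleared here.
    return "destroyed"


# A hand-built module with no other reference is destroyed, not cleared.
sys.modules["demo_unreferenced"] = types.ModuleType("demo_unreferenced")
result = finalize_one_module("demo_unreferenced")
```

The "destroyed" branch is the one that matters for this issue: nothing in it touches the module dictionary.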

The problem is that even if a module object is destroyed, the module dictionary can remain alive thanks to various kinds of strong references to it.
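
A small sketch of how a module dictionary can outlive its module object. It assumes CPython reference counting (the module is freed as soon as its last reference goes away) and uses a hand-built module named "demo", which is made up for the illustration.

```python
import types
import weakref

# Hypothetical stand-in module, built by hand for the demonstration.
module = types.ModuleType("demo")
exec("def func():\n    pass\n", module.__dict__)

func = module.func  # a function holds a strong reference to its globals
module_ref = weakref.ref(module)

del module
# On CPython, the module object is destroyed right away...
module_gone = module_ref() is None
# ...but its dictionary is still reachable through func.__globals__.
dict_still_populated = func.__globals__["__name__"] == "demo"
```

Here func plays the role of any object that survives the module, such as a callback registered with the interpreter.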

Worst case example:
---
class VerboseDel:
    def __del__(self):
        print("Goodbye Cruel World")
obj = VerboseDel()

def func():
    pass

import os
os.register_at_fork(after_in_child=func)
del os
del func

print("exit")
---

Output:
---
$ python3.9 script.py
exit
---

=> The VerboseDel object is never destroyed :-( BUG!


Explanation:

* os.register_at_fork(after_in_child=func) stores func in PyInterpreterState.after_forkers_child, so func is kept alive until interpreter_clear() calls Py_CLEAR(interp->after_forkers_child).

* func holds a reference to the module dictionary (its globals).

I'm not sure why the VerboseDel object is not destroyed.


I propose rewriting finalize_modules() to clear modules in a more deterministic order:

* start by clearing the __main__ module variables
* then iterate over reversed(sys.modules.values()) and clear the variables of each module
* module attributes are cleared by _PyModule_ClearDict(): iterate over reversed(module.__dict__) and set the dict values to None
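
The proposed order can be sketched in pure Python as follows. The helper names are hypothetical, and skipping __main__ in the second pass is an assumption of this sketch, not something the proposal states. The demo relies on CPython reference counting to run __del__ immediately.

```python
def clear_dict_reversed(namespace):
    """Set values to None, most recently added key first."""
    for key in reversed(list(namespace)):
        namespace[key] = None


def proposed_clear_order(modules):
    """Return module names in the order this sketch would clear them."""
    order = []
    if "__main__" in modules:
        order.append("__main__")
    for name in reversed(list(modules)):
        if name != "__main__":  # assumption: do not clear __main__ twice
            order.append(name)
    return order


log = []


class Tracker:
    """Records its name when destroyed, like VerboseDel but without printing."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        log.append(self.name)


namespace = {"a": Tracker("a"), "b": Tracker("b"), "c": Tracker("c")}
clear_dict_reversed(namespace)
# On CPython, log is now ["c", "b", "a"]: newest attribute first.
```

Calling proposed_clear_order() on a dict with the keys "sys", "__main__", "a", "b" (in that insertion order) returns ["__main__", "b", "a", "sys"].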


Drawback: this is a backward incompatible change. Code which previously worked by luck will no longer work. I'm talking about applications which rely on __del__() methods being called in an exact order and expect Python to be in a specific state.

Example:
---
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

a = VerboseDel("a")
b = VerboseDel("b")
c = VerboseDel("c")
---

Output:
---
c
b
a
---

=> Module attributes are deleted in the reverse order of their definition: the most recent object is deleted first, the oldest is deleted last.


Example 2 with 3 modules (4 files):
---
$ cat a.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

a = VerboseDel("a")


$ cat b.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

b = VerboseDel("b")


$ cat c.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

c = VerboseDel("c")


$ cat z.py 
import a
import b
import c
---

Output:
---
$ ./python z.py 
c
b
a
---

=> Modules are deleted from the most recently imported (import c) to the least recently imported module (import a).
History
Date User Action Args
2020-12-17 21:13:20  vstinner  set     recipients: + vstinner
2020-12-17 21:13:20  vstinner  set     messageid: <1608239600.62.0.0456277879965.issue42671@roundup.psfhosted.org>
2020-12-17 21:13:20  vstinner  link    issue42671 messages
2020-12-17 21:13:19  vstinner  create