classification
Title: Clean modules in the reversed order
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: barry, brett.cannon, eric.snow, ncoghlan, pitrou, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2018-04-22 10:38 by serhiy.storchaka, last changed 2018-10-29 18:14 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 6565 merged serhiy.storchaka, 2018-04-22 10:44
Messages (7)
msg315603 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-04-22 10:38
Since dict is ordered, modules in sys.modules are ordered for the time of importing. Currently they are cleaned in PyImport_Cleanup() in the direct order -- from imported first to imported later. I wonder if cleaning them in the reversed order can solve some problems with the interpreter shutdown.

For example reverting the order fixes issue33328 and may help in other cases.

If revert the order, should only iterating weaklist be reverted (with setting all module globals to None), or iterating sys.modules (with setting sys.module values to None) too?
msg319135 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-06-09 04:20
I think the key concern here is that we *don't* consistently add modules to sys.modules in the order their bodies are executed: we add them in a kind of sawtooth order based on the imports from __main__, and the order of import statements in the imported modules.

For example, given the following import chains:

    __main__ imports A then X
    A imports B which imports C which imports D
    X imports Y which imports B
    X then imports Z which imports C

Then the order in which modules get added to sys.modules will be:

    __main__, A, B, C, D, X, Y, Z

and they'll get cleaned up from left to right

(We're making the assumption here that, for whatever reason, GC hasn't cleaned up A, X, and their dependencies after sys.modules got cleared)

This means that in the status quo, unloading X, Y, and Z can have problems, since B, C, and D will already be gone.

Reversing the order doesn't fix that, and if anything will make things worse, as it means that in the "A -> B -> C -> D" dependency chain, A now gets cleared *last*, instead of getting cleared first as it does today.

So instead of just reversing the order, I wondering if what we may want to consider doing is to:

1. Make sure that we're dropping all the interpreter's internal references to __main__ before clearing sys.modules
2. When running through the module weakref list, force another cyclic GC run after each not-yet-cleared module gets cleaned up
msg319136 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-06-09 04:38
It also occurred to me to ask whether or not moving modules to the end of sys.modules each time they're successfully imported would help solve the problem (albeit at the expense of making import cache hits more expensive).

I don't think it does, though, since in my example above, the least-recently-imported ordering would end up looking like:

    __main__, A, D, X, Y, B, Z, C

Since D was only imported by C, and hence never gets moved to the end later, even when C gets moved by the import from Z.

Instead, if we truly wanted to solve the problem comprehensively, we'd need to:

1. Use a context variable to track a stack of "currently running imports"
2. Track (as a process global) a directed (potentially cyclic!) graph of which modules imported other modules (i.e. the import of the target module started while the importing module was the currently active import on the stack). Lazy imports would typically show up as being initiated by __main__.
3. During shutdown, linearise the shutdown order for any modules which weren't already cleaned up by the cyclic GC.

Taking better advantage of the existing cyclic GC seems like it should be simpler, though, and would hopefully be enough to handle all but the most complex edge cases.
msg326798 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-01 13:29
If move a module to the end of sys.modules when it have finished execution, we will get the following order:

    D, C, B, A, Y, Z, X, __main__

In can be cleaned from the end, and each module will be cleaned before its dependencies.
msg327037 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-10-04 05:58
Aye, that option sounds like it would work to me (as long as throwing an
exception is counted as finishing execution, so the failed module gets
moved to the end before getting cleaned up)
msg327048 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-04 10:53
While this change fixes issue33328, it doesn't help with the similar issue13044 which is more complex. Still I think there is a value of wiping modules in such order.
msg328844 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-29 17:30
New changeset c93c58b5d560cfe44d9884ff02c9b18e06333360 by Serhiy Storchaka in branch 'master':
bpo-33331: Clean modules in the reversed order in PyImport_Cleanup(). (GH-6565)
https://github.com/python/cpython/commit/c93c58b5d560cfe44d9884ff02c9b18e06333360
History
Date User Action Args
2018-10-29 18:14:30serhiy.storchakasetstatus: open -> closed
type: enhancement
stage: patch review -> resolved
resolution: fixed
versions: - Python 3.6, Python 3.7
2018-10-29 17:30:28serhiy.storchakasetmessages: + msg328844
2018-10-04 10:53:48serhiy.storchakasetmessages: + msg327048
2018-10-04 05:58:17ncoghlansetmessages: + msg327037
2018-10-01 13:29:48serhiy.storchakasetmessages: + msg326798
2018-06-09 04:38:42ncoghlansetmessages: + msg319136
2018-06-09 04:20:56ncoghlansetmessages: + msg319135
2018-04-22 16:52:33barrysetnosy: + barry
2018-04-22 10:44:27serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request6260
2018-04-22 10:38:18serhiy.storchakacreate