Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module shutdown procedure based on GC #39300

Closed
arigo mannequin opened this issue Sep 25, 2003 · 24 comments
Closed

module shutdown procedure based on GC #39300

arigo mannequin opened this issue Sep 25, 2003 · 24 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@arigo
Copy link
Mannequin

arigo mannequin commented Sep 25, 2003

BPO 812369
Nosy @loewis, @brettcannon, @arigo, @nascheme, @gpshead, @ncoghlan, @abalkin, @pitrou, @tiran, @asvetlov, @florentx, @cburroughs, @ericsnowcurrently
Superseder
  • bpo-18214: Stop purging modules which are garbage collected before shutdown
  • Files
  • x.txt: patch
  • 0001-update-GC-shutdown-patch.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2013-07-30.18:34:23.546>
    created_at = <Date 2003-09-25.10:49:56.000>
    labels = ['interpreter-core', 'type-bug']
    title = 'module shutdown procedure based on GC'
    updated_at = <Date 2013-08-21.06:13:40.226>
    user = 'https://github.com/arigo'

    bugs.python.org fields:

    activity = <Date 2013-08-21.06:13:40.226>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-07-30.18:34:23.546>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2003-09-25.10:49:56.000>
    creator = 'arigo'
    dependencies = []
    files = ['5605', '13091']
    hgrepos = []
    issue_num = 812369
    keywords = ['patch']
    message_count = 24.0
    messages = ['44689', '44690', '44691', '44692', '59257', '79671', '82136', '84775', '94027', '94033', '110337', '114285', '118245', '118259', '153530', '153576', '153603', '154094', '154122', '172099', '172106', '172165', '179098', '193946']
    nosy_count = 17.0
    nosy_names = ['loewis', 'brett.cannon', 'arigo', 'nascheme', 'glchapman', 'gregory.p.smith', 'ncoghlan', 'belopolsky', 'pitrou', 'christian.heimes', 'andrea.corbellini', 'asvetlov', 'flox', 'cburroughs', 'BreamoreBoy', 'eric.snow', 'Stefan.Friesel']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'patch review'
    status = 'closed'
    superseder = '18214'
    type = 'behavior'
    url = 'https://bugs.python.org/issue812369'
    versions = ['Python 3.2']

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Sep 25, 2003

    This patches changes PyImport_Cleanup() in an attempt
    to make the module shutdown order more predictable in
    the presence of modules that import each other. Before
    it explicitely clears the globals of the modules, it
    relies on the GC to try to do it more cleanly.

    http://mail.python.org/pipermail/python-dev/2003-September/038238.html

    To prevent objects with __del__ methods from keeping
    whole modules alive I actually replace each module with
    a weak reference to it in sys.modules. This allows me
    to find modules that remain alive after a call to
    PyGC_Collect(), and then go back to the old technique
    of clearing their globals.

    Note that weak references to dead cycles containing
    objects with finalizers have a strange property in
    Python: if you use the weak reference again you can
    break the cycles, but the objects with finalizers still
    won't be collected because they are in gc.garbage. As
    this is exactly what occurs above, I clear the
    gc.garbage list explicitely before the final
    PyGC_Collect() call. I'm not sure exactly what this
    might do; could it release older objects that were
    never put in a module but that at some time were put in
    gc.garbage and whose cycles were later broken? If so,
    is it a good thing to release them now? (Would it make
    sense to clear gc.garbage and call gc.collect again
    from time to time to check if the objects are still in
    a cycle?)

    This patch does not change the behavior of module
    objects clearing their globals dictionary as soon as
    they are deallocated. This could be work investigating.

    This patch puts weak references (actually proxies) in
    sys.modules but only at shutdown time. Moure thoughts
    could be given towards allowing weak references during
    normal program execution as well. To do so we must
    ensure that 'import' statements return the real module,
    not the weak proxy object, which is more difficult than
    it first seems in the presence of packages.

    And finally -- this patch has not really been tested,
    apart from the fact that it passes the test suite.

    @arigo arigo mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 25, 2003
    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 27, 2003

    Logged In: YES
    user_id=21627

    I think clearing gc.garbage at shutdown time is a good idea;
    gc.garbage is mostly a debugging aid, and should be empty in
    production code. If it still contains objects at shutdown
    time, it is IMO reasonable to give them a chance to become
    collected, in case somebody broke their cycles.

    @glchapman
    Copy link
    Mannequin

    glchapman mannequin commented Mar 11, 2004

    Logged In: YES
    user_id=86307

    Not sure if this is a good idea, but I wonder if the extensions
    dictionary should be cleared (ie _PyImport_Fini called)
    sooner, possibly even before PyImport_Cleanup. This would
    allow garbage collection during PyImport_Cleanup to catch
    anything a C module might have created which is in a cycle
    with its module (through a bad design on my part, I recently
    discovered I had created just such a cycle).

    I suppose _PyImport_Fini is called when it is called because
    some code during PyImport_Cleanup might import a
    dynamic module, which would then create a new extensions
    dictionary if _PyImport_Fini had already been run. Perhaps
    a flag could be added so that _PyImport_FixupExtension
    would not try to add a module's dict to extensions if Python
    is currently shutting down.

    @brettcannon
    Copy link
    Member

    Logged In: YES
    user_id=357491

    Just so you know, Armin, the patch did not apply cleanly; the comment
    for pythonrun.c did not apply. I also need to add an extern declaration in
    import.c for _PyGC_garbage for the code to compile (OS X 10.3, gcc
    3.3).

    @tiran
    Copy link
    Member

    tiran commented Jan 4, 2008

    Consider this patch for 2.6 and discuss it at the bug day.

    @pitrou
    Copy link
    Member

    pitrou commented Jan 12, 2009

    Looking at the patch, is there any reason it doesn't get rid of the
    current _PyModule_Clear() implementation to replace it by a call to
    PyDict_Clear() followed by PyGC_Collect()?
    (the call to PyGC_Collect could be disabled while finalizing, because
    there's no use calling it as many times as there are modules to be
    disbanded)

    The major annoyance with the current scheme is that, at interpreter
    shutdown, some globals you want to rely on in your destructors suddenly
    become None.

    About what to do of gc.garbage at shutdown, there was another proposal
    in bpo-477863.

    @nascheme
    Copy link
    Member

    This sounds like a nice idea. The current cleanup procedure in
    pythonrun.c is pretty lame since it can play havoc with __del__ methods
    (e.g. if they run after globals have been cleared).

    I've updated the patch to work with the current SVN head. Probably this
    should get tested with big applications based on Zope, Django, etc.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 31, 2009

    Retargetting, and I hope someone can take a look at the patch and give
    it the green light :-)

    @pitrou pitrou added the type-bug An unexpected behavior, bug, or error label Mar 31, 2009
    @abalkin
    Copy link
    Member

    abalkin commented Oct 14, 2009

    Does this patch fix bpo-1545463 by any chance? I am away from a
    development box ATM and cannot test the patch myself.

    @gpshead gpshead self-assigned this Oct 14, 2009
    @nascheme
    Copy link
    Member

    It should fix bpo-1545463 and running a quick test seems to show that
    it does.

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jul 14, 2010

    bpo-1545463 has been closed as "won't fix", so wouldn't implementing this patch kill two birds with one stone?

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Aug 18, 2010

    bpo-1545463 has been reopened with comments about being used as a stop gap. Possibly review and implementation of the patch here would be a better option, sorry it's over my head.

    @gpshead
    Copy link
    Member

    gpshead commented Oct 9, 2010

    0001-update-GC-shutdown-patch.patch looks sane to me at first glance. any other opinions?

    @pitrou
    Copy link
    Member

    pitrou commented Oct 9, 2010

    The patch is obviously against 2.x (there are some PyString_Check's on module names, for example). It should be regenerated against 3.x.

    Also, it would be nice if a test could be devised to check that the shutdown procedure works as expected (I'm not sure how such a test would look like).

    @ncoghlan
    Copy link
    Contributor

    In bpo-14035, Florent pointed out the current behaviour potentially causes problems for some uses of import_fresh_modules() in the test suite (with globals sometimes being set to None if there's no indepenent reference to the module).

    GC based module cleanup would avoid that problem automatically.

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Feb 17, 2012

    Fwiw, the behavior in PyPy is: don't do anything particular at shut-down, just shut down and quit the process. No hacking at module globals to replace them with None, but also no guaranteeing that any __del__ method is ever called. We didn't get a particular bug report about this so far, so it seems to work.

    (This is just a report about PyPy's situation; I understand that the situation in CPython is a bit more delicate if CPython is embedded in a larger process.)

    @ncoghlan
    Copy link
    Contributor

    Also, since this issue was last updated, Antoine devised a scheme to test some of the embedding functionality (mainly to test subinterpreters, IIRC). Perhaps that could be harnessed to check GC-based shutdown is working correctly (it might even do it already, without any changes to the test suite).

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Feb 23, 2012

    (This is just a report about PyPy's situation; I understand that the
    situation in CPython is a bit more delicate if CPython is embedded in
    a larger process.)

    I think that would indeed be unacceptable for Python - there is a
    long-standing expectation that we free all memory that we allocated,
    as well as release any other resources that we hold. There are also
    expectations wrt. running atexit code. So there clearly must be a
    shutdown procedure.

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Feb 24, 2012

    Obviously we run atexit code too. There is no point in having atexit if it's not guaranteed to run in a normal shutdown.

    @StefanFriesel
    Copy link
    Mannequin

    StefanFriesel mannequin commented Oct 5, 2012

    What is the status of this? Does the patch need more reviewing?

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 5, 2012

    At the moment, it's like that the status of the patch needs to be reestablished. Does it apply? Does it work? Does the test suite still pass?

    @nascheme
    Copy link
    Member

    nascheme commented Oct 6, 2012

    It's been quite a long time since I played with this patch so my memory might be a bit fuzzy. As I recall, it sounds good in theory but in practice it doesn't really work. One of the core problems is that many extension modules keep references to Python objects in global or static variables. These references keep pretty much everything alive and prevent GC cleanup of modules.

    So, a necessary condition to this working is to get rid of those references and use the new module struct facility introduced by Martin. That would be a huge amount of work but I think should be the long term goal.

    @gpshead gpshead removed their assignment Nov 10, 2012
    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jan 5, 2013

    In addition to the problem Neil noted with references in extension modules keeping module objects themselves alive, Antoine recently noted that the other major challenge is the reference cycles between module global dictionaries and their contents. As soon as a module global has both a __del__ method and a reference back to the module globals, the entire cycle becomes uncollectable. I suspect one of the reasons PyPy can cope without the explicit reference breaking step is that their GC is better able to cope with __del__ methods than ours.

    I wonder if a useful interim step might be to make the current explicit reference breaking hack a bit smarter by looking at the reference counts. (Note: some aspects of this idea could be made simpler if modules supported weak references)

    1. Call importlib.invalidate_caches()
    2. Delete the first module from sys.modules that has a reference count of exactly one
    3. Repeat 2 until sys.modules is empty or every remaining module has a reference count greater than 1 (meaning another module has a reference to it one way or another)
    4. Pick the module in sys.modules with the lowest number of references to it, delete it from sys.modules and delete the reference from the module object to its dictionary
    5. Repeat 4 until sys.modules is empty

    Throughout the process, keep an eye on gc.garbage - if we see a module dict show up there, hit it with the "set all globals to None" hammer. (The new callback functionality in 3.3 makes that easier - for example, you could put a sentinel object in the globals of the module being cleared and watching for a dict containing that sentinel object showing up in 'uncollectable' during the stop phase)

    @pitrou
    Copy link
    Member

    pitrou commented Jul 30, 2013

    Superceded by patch in bpo-18214.

    @pitrou pitrou closed this as completed Jul 30, 2013
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants