
Deep-freezed modules create inconsistency in sys.gettotalrefcount() (_Py_Reftotal) #90607

Closed
vstinner opened this issue Jan 21, 2022 · 21 comments
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@vstinner
Member

BPO 46449
Nosy @gvanrossum, @vstinner, @tiran, @ericsnowcurrently, @kumaraditya303
PRs
  • bpo-46449: Adjust refcnt for Py_INCREF/Py_DECREF on immortal code-objects #30976
  • bpo-46449: Fix refcount of deepfrozen code #30984
  • bpo-46449: add more assertions to verify that deepfreeze modules are finalized properly #30985
  • bpo-46449: use strong reference in deepfreeze modules  #30987
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2022-01-28.13:02:46.877>
    created_at = <Date 2022-01-21.03:02:06.727>
    labels = ['interpreter-core', '3.11']
    title = 'Deep-freezed modules create inconsistency in sys.gettotalrefcount() (_Py_Reftotal)'
    updated_at = <Date 2022-01-28.13:02:46.876>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-01-28.13:02:46.876>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-01-28.13:02:46.877>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2022-01-21.03:02:06.727>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46449
    keywords = ['patch']
    message_count = 21.0
    messages = ['411075', '411083', '411094', '411095', '411152', '411212', '411308', '411316', '411327', '411401', '411419', '411467', '411468', '411477', '411496', '411868', '411884', '411974', '411975', '411991', '411993']
    nosy_count = 5.0
    nosy_names = ['gvanrossum', 'vstinner', 'christian.heimes', 'eric.snow', 'kumaraditya']
    pr_nums = ['30976', '30984', '30985', '30987']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue46449'
    versions = ['Python 3.11']

    @vstinner
    Member Author

    Using the C program below, I see that _Py_RefTotal is decreasing at each iteration:
    ---

    /* Note: _Py_RefTotal only exists in Py_REF_DEBUG builds (e.g. configured
       with --with-pydebug). */
    #include <Python.h>
    int main(int argc, char *argv[])
    {
        for (int i=1; i <= 100; i++) {
            Py_SetProgramName(L"./_testembed");
            Py_Initialize();
            Py_Finalize();
            /* The total should stay constant across init/finalize cycles. */
            printf("Loop #%d: %zd refs\n", i, _Py_RefTotal);
        }
    }

    Example of output:
    ---
    ...
    Loop #96: 9557 refs
    Loop #97: 9544 refs
    Loop #98: 9531 refs
    Loop #99: 9518 refs
    Loop #100: 9505 refs
    ---

    It seems to be a regression caused by this change:

    commit 1cbaa50
    Author: Guido van Rossum <guido@python.org>
    Date: Wed Nov 10 18:01:53 2021 -0800

    bpo-45696: Deep-freeze selected modules (GH-29118)
    
    This gains 10% or more in startup time for `python -c pass` on UNIX-ish systems.
    
    The Makefile.pre.in generating code builds on Eric's work for bpo-45020, but the .c file generator is new.
    
    Windows version TBD.
    

    Before the change, _Py_RefTotal was stable:
    ---
    ...
    Loop #97: 10805 refs
    Loop #98: 10805 refs
    Loop #99: 10805 refs
    Loop #100: 10805 refs
    ---

    I found this issue while working on bpo-46417 which is related to bpo-1635741.

    @vstinner vstinner added 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jan 21, 2022
    @gvanrossum
    Member

    Hm, the deep-frozen objects are statically initialized with a very large refcount that isn't accounted for (they are intended to be immortal). It seems that Py_Finalize() somehow decrefs those objects. I guess this means we need some kind of flag indicating certain objects are immortal (Eric has proposed several schemes), then we could just mark these objects as immortal.

    This reminds me, since #30715 (which I merged yesterday) the deep-frozen objects also reference the small ints directly, as well as the singleton for b"". Is this even safe across Py_Finalize()/Py_Initialize()? If not, we'll need to roll that back as well.
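
    For context on the _Py_RefTotal mismatch described above, here is a simplified sketch (assuming a Py_REF_DEBUG build; these are not the real macros from Include/object.h) of how reference counting feeds the global total. Because the large initial refcount of a statically initialized object was never added to _Py_RefTotal, any extra Py_DECREF() on it during finalization drags the reported total down:

    #include <Python.h>

    #ifdef Py_REF_DEBUG
    /* Simplified sketch of Py_INCREF/Py_DECREF bookkeeping in a debug build. */
    static inline void sketch_incref(PyObject *op)
    {
        _Py_RefTotal++;        /* global bookkeeping */
        op->ob_refcnt++;       /* per-object count */
    }

    static inline void sketch_decref(PyObject *op)
    {
        _Py_RefTotal--;        /* decremented even for "immortal" objects */
        if (--op->ob_refcnt == 0) {
            _Py_Dealloc(op);
        }
    }
    #endif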

    @vstinner
    Member Author

    > Hm, the deep-frozen objects are statically initialized with a very large refcount that isn't accounted for (they are intended to be immortal). It seems that Py_Finalize() somehow decrefs those objects. I guess this means we need some kind of flag indicating certain objects are immortal (Eric has proposed several schemes), then we could just mark these objects as immortal.

    The problem is only _Py_RefTotal. Maybe frozen_only_do_patchups() should increment _Py_RefTotal when Python is built with the Py_DEBUG macro defined, so it can be safely decremented in Py_Finalize()?

    Adding a flag in PyObject/PyTypeObject and modifying Py_DECREF() sounds more controversial. I suggest doing that later ;-) (I am not convinced that it's the best approach.) I would suggest writing a PEP for immortal objects.
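
    A minimal sketch of the _Py_RefTotal adjustment idea from the previous paragraph, assuming a Py_REF_DEBUG build (the function name below is hypothetical, not an actual CPython API):

    #include <Python.h>

    #ifdef Py_REF_DEBUG
    /* Hypothetical helper: whenever the runtime hands out a reference to a
       deep-frozen object without going through Py_INCREF(), bump the global
       total by one so the matching Py_DECREF() in Py_Finalize() leaves
       _Py_RefTotal unchanged. */
    static void
    account_borrowed_deepfrozen_ref(void)
    {
        _Py_RefTotal++;
    }
    #endif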

    @vstinner
    Member Author

    > This reminds me, since #30715 (which I merged yesterday) the deep-frozen objects also reference the small ints directly, as well as the singleton for b"". Is this even safe across Py_Finalize()/Py_Initialize()? If not, we'll need to roll that back as well.

    I don't know.

    @gvanrossum
    Member

    I was hoping @eric.snow could tell me.

    @ericsnowcurrently
    Member

    > the deep-frozen objects also reference the small ints directly, as well as the singleton for b"".
    > Is this even safe across Py_Finalize()/Py_Initialize()? If not, we'll need to roll that back as well.

    The small ints and the empty bytes object each have "immortal" refcounts too (999999999, just like you did in deepfreeze). So they would cause a similar behavior to what Victor reported. Otherwise I wouldn't expect any problems across Py_Finalize()/Py_Initialize().
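
    To illustrate what such an "immortal" refcount looks like, here is a hedged sketch (assuming the PyObject layout of that era; this is not the actual generated code) of a statically allocated object whose ob_refcnt is pre-set so that Py_DECREF() effectively never frees it:

    #include <Python.h>

    /* Illustrative only: a static object using the same 999999999 trick as
       the small ints, the b"" singleton and the deep-frozen code objects. */
    static PyObject immortal_like = {
        .ob_refcnt = 999999999,
        .ob_type = &PyBaseObject_Type,
    };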

    @vstinner
    Member Author

    Is there a way to disable deepfreeze when building Python?

    It makes the Python build way slower. For example, a full build (after "make clean") of Python 3.10 takes 14.9 seconds on my laptop, whereas Python 3.11 takes 24.6 seconds (1.6x slower). It makes my workflow (trial-and-error based ;-)) less efficient.

    Moreover, I would like to disable it to investigate why _Py_RefTotal is now negative at Python exit:
    https://bugs.python.org/issue46417#msg411307

    Note: I pushed many changes in bpo-46417 to clear static types and a few "static" objects at Python exit (in Py_Finalize()).

    @vstinner
    Member Author

    See also bpo-46476: "Not all memory allocated by _Py_Quicken() is released at Python exit".

    @gvanrossum
    Member

    > Is there a way to disable deepfreeze when building Python?

    It looks like this isn't easy, sorry. :-( Adding Christian Heimes in case he has a suggestion.

    @tiran
    Member

    tiran commented Jan 23, 2022

    If you modify the freeze_modules script to generate code like https://github.com/python/cpython/compare/main...tiran:split_frozen?expand=1 , then I can add a build option to compile Python with only required frozen modules.

    @gvanrossum
    Member

    @Kumar do you want to tackle this?

    @kumaraditya303
    Contributor

    > @Kumar do you want to tackle this?

    I don't like this approach, as it is the opposite of what we did to reduce the size of the deep-frozen modules by merging them into one file; this approach requires splitting them again.

    @tiran
    Member

    tiran commented Jan 24, 2022

    Do you have an alternative suggestion for how to build Python with a minimal set of required deep-frozen modules, or without any deep-frozen modules at all? A minimal set is not only helpful for Victor's use case. I would also like to use it in WebAssembly builds to reduce the overall file size.

    @vstinner
    Member Author

    I don't want to change the default. Keeping fast startup time is a nice goal!

    I'm asking to make it configurable for my own development workflow: build Python as fast as possible.

    It is easy to hack Makefile.pre.in and Python/frozen.c to freeze fewer modules. If you want, I can try to work on a patch to make it configurable. Maybe some people want to freeze... *more* modules, rather than fewer ;-)

    @gvanrossum
    Member

    I tried to make the 'FROZEN' variable in freeze_modules.py empty, but it has a bunch of places where this is unexpected. Maybe someone can fix that?

    @vstinner
    Member Author

    bpo-46476 added the _Py_Deepfreeze_Fini() and _PyStaticCode_Dealloc() functions: commit c7f810b. If we need to adjust _Py_RefTotal manually, *maybe* it can be done there?

    I don't fully understand how static/immortal code objects lead to a negative _Py_RefTotal. As far as I can tell, Py_INCREF() and Py_DECREF() should still be used on these objects, no?

    @tiran
    Member

    tiran commented Jan 27, 2022

    The problem is in the PyImport_ImportFrozenModuleObject() -> unmarshal_frozen_code() -> frozen_info.get_code() -> _Py_get_importlib__bootstrap_external_toplevel() call chain.

    PyImport_ImportFrozenModuleObject() expects unmarshal_frozen_code() to return a strong reference to the code object. However, the frozen_info struct's get_code() function returns a borrowed reference to deepfreeze.c's toplevel code object.

    /* --- test.c --- */
    #include <Python.h>
    int main(int argc, char *argv[])
    {
        for (int i=1; i <= 100; i++) {
            Py_SetProgramName(L"./_testembed");
            Py_Initialize();
            Py_Finalize();
            printf("Loop #%d: %zd refs, bootstrap refs: %zd\n", i, _Py_RefTotal, Py_REFCNT(_Py_get_importlib__bootstrap_external_toplevel()));
        }
    }

    $ gcc -IInclude -I. -o test test.c libpython3.11d.a -lm && ./test
    Loop #1: -3 refs, bootstrap refs: 999999998
    Loop #2: -8 refs, bootstrap refs: 999999997
    Loop #3: -13 refs, bootstrap refs: 999999996
    Loop #4: -18 refs, bootstrap refs: 999999995
    Loop #5: -23 refs, bootstrap refs: 999999994
    Loop #6: -28 refs, bootstrap refs: 999999993
    Loop #7: -33 refs, bootstrap refs: 999999992
    Loop #8: -38 refs, bootstrap refs: 999999991
    Loop #9: -43 refs, bootstrap refs: 999999990
    Loop #10: -48 refs, bootstrap refs: 999999989

    After I changed unmarshal_frozen_code() to "return Py_NewRef(code);", the reference count of the frozen bootstrap module stays stable, but the total refcount increases over time:

    Loop #1: 10 refs, bootstrap refs: 999999999
    Loop #2: 18 refs, bootstrap refs: 999999999
    Loop #3: 26 refs, bootstrap refs: 999999999
    Loop #4: 34 refs, bootstrap refs: 999999999
    Loop #5: 42 refs, bootstrap refs: 999999999
    Loop #6: 50 refs, bootstrap refs: 999999999
    Loop #7: 58 refs, bootstrap refs: 999999999
    Loop #8: 66 refs, bootstrap refs: 999999999
    Loop #9: 74 refs, bootstrap refs: 999999999
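
    For reference, a minimal sketch of the two ownership conventions involved (illustrative helper names, not the actual import.c code): PyImport_ImportFrozenModuleObject() expects a strong reference, while the deep-frozen getter was handing back a borrowed one.

    #include <Python.h>

    /* Borrowed: the object stays owned by deepfreeze.c's static storage;
       the caller must not Py_DECREF() it. */
    static PyObject *
    get_code_borrowed(PyObject *toplevel)
    {
        return toplevel;
    }

    /* Strong: bump the refcount so the caller's eventual Py_DECREF() is
       balanced, which is what the importer expects. */
    static PyObject *
    get_code_strong(PyObject *toplevel)
    {
        return Py_NewRef(toplevel);
    }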

    @kumaraditya303
    Contributor

    I created PR #30976, which adjusts _Py_RefTotal and the refcount of immortal code objects to account for Py_INCREF/Py_DECREF on code objects. With that patch the total refcount is 8 and increases by 8 with each initialization of Python.

    @kumaraditya303
    Contributor

    Christian's solution seems better to me, so I'll close my PR. Christian, would you like to create a PR for it?

    @vstinner
    Member Author

    New changeset 5a9e423 by Kumar Aditya in branch 'main':
    bpo-46449: deepfreeze get_code() now returns strong ref (GH-30987)

    @vstinner
    Member Author

    With my additional #75171 fix, the msg411075 example no longer leaks :-)
    ---
    Loop #1: 2 refs
    Loop #2: 2 refs
    Loop #3: 2 refs
    ...
    Loop #98: 2 refs
    Loop #99: 2 refs
    Loop #100: 2 refs
    ---

    I close the issue.

    Thanks Christian and Kumar for the fix!

    @ezio-melotti transferred this issue from another repository on Apr 10, 2022