Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Share global string identifiers in deepfreeze #90868

Closed
kumaraditya303 opened this issue Feb 10, 2022 · 7 comments
Closed

Share global string identifiers in deepfreeze #90868

kumaraditya303 opened this issue Feb 10, 2022 · 7 comments
Labels
3.11 only security fixes performance Performance or resource usage

Comments

@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Feb 10, 2022

BPO 46712
Nosy @gvanrossum, @ericsnowcurrently, @sweeneyde, @kumaraditya303
PRs
  • bpo-46712: Share global string identifiers in deepfreeze #31261
  • bpo-46430: fix error-handling in _Py_Deepfreeze_Init #31596
  • bpo-46712: Let generate_global_objects.py Run on Earlier Python Versions #31637
  • bpo-46712: Do not Regen Deep-Frozen Modules before Generating Global Objects #32061
  • bpo-46712: share more global strings in deepfreeze #32152
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-02-25.18:08:29.439>
    created_at = <Date 2022-02-10.14:18:03.087>
    labels = ['3.11', 'performance']
    title = 'Share global string identifiers in deepfreeze'
    updated_at = <Date 2022-03-28.09:14:38.402>
    user = 'https://github.com/kumaraditya303'

    bugs.python.org fields:

    activity = <Date 2022-03-28.09:14:38.402>
    actor = 'kumaraditya'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-02-25.18:08:29.439>
    closer = 'gvanrossum'
    components = []
    creation = <Date 2022-02-10.14:18:03.087>
    creator = 'kumaraditya'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46712
    keywords = ['patch']
    message_count = 7.0
    messages = ['413003', '413005', '414028', '414295', '415665', '415811', '415882']
    nosy_count = 4.0
    nosy_names = ['gvanrossum', 'eric.snow', 'Dennis Sweeney', 'kumaraditya']
    pr_nums = ['31261', '31596', '31637', '32061', '32152']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue46712'
    versions = ['Python 3.11']

    @kumaraditya303
    Copy link
    Contributor Author

    Since bpo-46541, the global strings are statically allocated so they can now be referenced by deep-frozen modules just like any other singleton. Sharing identifiers with deepfreeze will reduce the duplicated strings hence it would save space.

    See faster-cpython/ideas#218
    See faster-cpython/ideas#230

    @kumaraditya303 kumaraditya303 added 3.11 only security fixes labels Feb 10, 2022
    @kumaraditya303
    Copy link
    Contributor Author

    I have refactored generate_global_objects.py, and now instead of hard-coding every identifier manually, it now scans *.c files extracts the identifiers used in it and then generate the header file. This has multiple advantages:

    • No need to manually add identifiers, as soon as it is used in a c file it is added to the global identifiers struct.
    • It simplifies the codegen a lot.
    • Remove the need of special casing certain file for checking now it is just a set of identifiers and auto removes unused global strings.

    @gvanrossum
    Copy link
    Member

    New changeset eb002db by Kumar Aditya in branch 'main':
    bpo-46712: Share global string identifiers in deepfreeze (GH-31261)
    eb002db

    @gvanrossum gvanrossum added the performance Performance or resource usage label Feb 25, 2022
    @gvanrossum gvanrossum added the performance Performance or resource usage label Feb 25, 2022
    @ericsnowcurrently
    Copy link
    Member

    New changeset 21099fc by Eric Snow in branch 'main':
    bpo-46712: Let generate_global_objects.py Run on Earlier Python Versions (gh-31637)
    21099fc

    @sweeneyde
    Copy link
    Member

    In bpo-47067, there was concern about the addition of the makefile target from PR 31637:

    regen-global-objects: regen-deepfreeze
    

    After a new &_Py_ID(__orig_class__) is added to Objects/genericaliasobject.c, running make regen-global-objects starts

    gcc -pthread -c [snipped] -DPy_BUILD_CORE -o [Objects/genericaliasobject.o](https://github.com/python/cpython/blob/main/Objects/genericaliasobject.o) [Objects/genericaliasobject.c](https://github.com/python/cpython/blob/main/Objects/genericaliasobject.c)
    

    which fails with a compilation error because that identifier is not yet defined. Is there a good way to convince make to regenerate the global objects without this sort of circular dependency? Am I missing a step?

    @ericsnowcurrently
    Copy link
    Member

    After a new &_Py_ID(__orig_class__) is added to Objects/genericaliasobject.c, running make regen-global-objects starts

    gcc -pthread -c [snipped] -DPy_BUILD_CORE -o [Objects/genericaliasobject.o](https://github.com/python/cpython/blob/main/Objects/genericaliasobject.o) [Objects/genericaliasobject.c](https://github.com/python/cpython/blob/main/Objects/genericaliasobject.c)
    

    which fails with a compilation error because that identifier is not yet defined. Is there a good way to convince make to regenerate the global objects without this sort of circular dependency? Am I missing a step?

    I'm looking into this. A temporary workaround is to run Tools/scripts/generate-global-objects.py directly.

    @ericsnowcurrently
    Copy link
    Member

    New changeset febf54b by Eric Snow in branch 'main':
    bpo-46712: Do not Regen Deep-Frozen Modules before Generating Global Objects (gh-32061)
    febf54b

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    ericsnowcurrently added a commit that referenced this issue Nov 8, 2022
    We do the following:
    
    * move the generated _PyUnicode_InitStaticStrings() to its own file
    * move the generated _PyStaticObjects_CheckRefcnt() to its own file
    * include pycore_global_objects.h in extension modules instead of pycore_runtime_init.h
    
    These changes help us avoid including things that aren't needed.
    
    #90868
    vstinner added a commit that referenced this issue Nov 9, 2022
    Add _PyStaticObject_CheckRefcnt() function to make
    _PyStaticObjects_CheckRefcnt() shorter. Use
    _PyObject_ASSERT_FAILED_MSG() to log the object causing the fatal
    error.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants