
Implement LOAD_GLOBAL opcode cache #72345

Closed
1st1 opened this issue Sep 14, 2016 · 14 comments
Assignees
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@1st1
Member

1st1 commented Sep 14, 2016

BPO 28158
Nosy @brettcannon, @vstinner, @methane, @serhiy-storchaka, @1st1, @ztane
Files
  • load_globals_cache.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/1st1'
    closed_at = <Date 2019-12-25.13:46:21.432>
    created_at = <Date 2016-09-14.20:48:00.839>
    labels = ['interpreter-core', '3.7', 'performance']
    title = 'Implement LOAD_GLOBAL opcode cache'
    updated_at = <Date 2019-12-25.13:46:21.425>
    user = 'https://github.com/1st1'

    bugs.python.org fields:

    activity = <Date 2019-12-25.13:46:21.425>
    actor = 'methane'
    assignee = 'yselivanov'
    closed = True
    closed_date = <Date 2019-12-25.13:46:21.432>
    closer = 'methane'
    components = ['Interpreter Core']
    creation = <Date 2016-09-14.20:48:00.839>
    creator = 'yselivanov'
    dependencies = []
    files = ['44668']
    hgrepos = []
    issue_num = 28158
    keywords = ['patch']
    message_count = 14.0
    messages = ['276485', '276519', '276520', '276521', '276522', '276645', '276647', '276697', '276703', '283616', '283629', '283662', '283681', '358865']
    nosy_count = 6.0
    nosy_names = ['brett.cannon', 'vstinner', 'methane', 'serhiy.storchaka', 'yselivanov', 'ztane']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue28158'
    versions = ['Python 3.7']

    @1st1
    Member Author

    1st1 commented Sep 14, 2016

The attached patch implements an opcode cache for the LOAD_GLOBAL opcode, making it 2x faster.

    The idea is to use the new co_extra field of code objects & PEP-509 to attach a cache structure pointing directly to the resolved global name. When globals or builtins are updated, the cache is invalidated.

The patch adds a new file, "ceval_cache.h", which provides some infrastructure to ease the pain of implementing new caches for other opcode types. I can strip all the macros from that file if Victor finds it too hard to review.

    Here's a simple script I used to make sure that the cache is working: https://gist.github.com/1st1/a9660aabdcf6b8bc6205b9fe39a82bba

    You can also set OPCACHE_COLLECT_STATS to 1 to get debug stats output for the cache.
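The mechanism described above can be sketched as a toy Python model (this is illustrative only; the real patch implements it in C inside the eval loop, keyed on PEP 509 dict version tags):

```python
# Toy Python model of the cache idea (not the actual C implementation):
# a lookup result is cached together with a version tag, and the cache
# is considered stale whenever the dict has been mutated since.

class VersionedDict(dict):
    """Dict that bumps a version tag on every mutation (PEP 509 style)."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1

class GlobalCache:
    """Caches one resolved name; valid only while the dict version matches."""
    def __init__(self):
        self.version = -1
        self.value = None

    def lookup(self, ns, name):
        if self.version == ns.version:
            return self.value          # cache hit: skip the dict lookup
        value = ns[name]               # cache miss: do the real lookup
        self.version = ns.version
        self.value = value
        return value

ns = VersionedDict(x=1)
cache = GlobalCache()
assert cache.lookup(ns, "x") == 1      # miss, fills the cache
assert cache.lookup(ns, "x") == 1      # hit
ns["x"] = 2                            # mutation bumps the version...
assert cache.lookup(ns, "x") == 2      # ...so the cache is invalidated
```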

    @1st1 1st1 self-assigned this Sep 14, 2016
    @1st1 1st1 added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Sep 14, 2016
    @ztane
    Mannequin

    ztane mannequin commented Sep 15, 2016

I just noticed that you're actually testing a builtin rather than a name in module globals. How is the performance with module globals?
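For context (my illustration, not from the patch): both kinds of names compile to the same opcode, so the distinction only shows up in the runtime miss path, where a builtin first misses the globals dict:

```python
# Both a module-level global and a builtin compile to LOAD_GLOBAL;
# the difference is that a builtin misses the globals dict first and
# falls through to the builtins dict at runtime.
import dis

GLOBAL_CONST = 42

def use_global():
    return GLOBAL_CONST        # found in module globals

def use_builtin():
    return len("abc")          # missed in globals, found in builtins

for fn in (use_global, use_builtin):
    ops = [ins.opname for ins in dis.get_instructions(fn)]
    assert "LOAD_GLOBAL" in ops
```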

    @serhiy-storchaka
    Member

Does this prolong the lifetime of cached global objects?

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Sep 15, 2016
    @1st1
    Member Author

    1st1 commented Sep 15, 2016

Does this prolong the lifetime of cached global objects?

No. I store borrowed references. The idea is that if the state of the globals/builtins dicts has changed, we invalidate the cache.

    @1st1
    Member Author

    1st1 commented Sep 15, 2016

    Serhiy, feel free to review the patch. Guido and Ned okayed it to be committed before 3.6b2.

Currently the patch only optimizes one opcode, LOAD_GLOBAL, but the ceval_cache.h file provides the infrastructure to easily implement more opcode caches. I can simplify that file (remove all the macros) if it's too hard to review.

    @1st1
    Member Author

    1st1 commented Sep 16, 2016

    I'm going to commit this patch tomorrow.

    @vstinner
    Member

    Can you please post results of the performance benchmark suite?

    If you give me more time, I can review the patch.

    @serhiy-storchaka
    Member

There is too much magic in ceval_cache.h. Since the patch adds caching only for the LOAD_GLOBAL opcode, it would be much clearer to write functions only for the LOAD_GLOBAL opcode, without multilayer macros.

What will happen if there are more than 255 LOAD_GLOBALs? Wouldn't the 256th LOAD_GLOBAL return the incorrect cached value for the LOAD_GLOBAL with index 0?
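If the patch does store the per-instruction cache slot in a single byte (an assumption on my part, not confirmed by the patch), the aliasing described above would look like this:

```python
# Hypothetical illustration of the concern: if the cache slot index is
# stored in one unsigned byte, the 256th LOAD_GLOBAL would alias slot 0
# unless caching is disabled past the limit.

def slot_for(opcode_index):
    # What a uint8 field would actually hold after truncation.
    return opcode_index & 0xFF

assert slot_for(0) == 0
assert slot_for(255) == 255
assert slot_for(256) == 0   # aliases slot 0: wrong cache entry returned
```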

    @ztane
    Mannequin

    ztane mannequin commented Sep 16, 2016

I wouldn't actually weight builtin lookup speed much at all: in all non-trivial applications, the ratio of builtins to other names diminishes, and if there is a tight loop, it is always possible to speed it up with a `min = min` binding, as long as you do not need to support monkey-patching.
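One common form of the `min = min` trick mentioned above uses a default argument, which turns the name into a local and the lookup into a LOAD_FAST (example mine, for illustration):

```python
# Re-binding a builtin to a default argument turns LOAD_GLOBAL into the
# much cheaper LOAD_FAST inside the loop body -- at the cost of no longer
# seeing a monkey-patched builtins.min.
import dis

def smallest(items, min=min):       # 'min' is now a local (fast) name
    result = items[0]
    for item in items[1:]:
        result = min(result, item)
    return result

ops = {ins.opname for ins in dis.get_instructions(smallest)}
assert "LOAD_FAST" in ops           # 'min' is loaded as a local
assert smallest([3, 1, 2]) == 1
```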

    @methane
    Member

    methane commented Dec 19, 2016

I'll update this patch and bpo-10401 and then run the benchmark suite when I have time.

From a quick look, bpo-10401 uses namei as the cache key and this patch uses the opcode index as the cache key, am I right?

    @1st1
    Member Author

    1st1 commented Dec 19, 2016

From a quick look, bpo-10401 uses namei as the cache key and this patch uses the opcode index as the cache key, am I right?

    Correct. Actually, if you don't mind, I'd like to update the patch myself. I have a few ideas how to restructure it and add support for LOAD_ATTR.

    @methane
    Member

    methane commented Dec 20, 2016

    Actually, if you don't mind, I'd like to update the patch myself. I have a few ideas how to restructure it and add support for LOAD_ATTR.

    Sounds interesting.
I'll hold off on touching this patch. There is enough time before Python 3.7, and enough other issues I can work on.

    @methane
    Member

    methane commented Dec 20, 2016

memo: http://bugs.python.org/issue26219 may be related to this.

    @methane
    Member

    methane commented Dec 25, 2019

    This is implemented in bpo-26219.

    @methane methane closed this as completed Dec 25, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022