Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

co_stacksize is calculated from unoptimized code #70736

Closed
ztane mannequin opened this issue Mar 12, 2016 · 7 comments
Closed

co_stacksize is calculated from unoptimized code #70736

ztane mannequin opened this issue Mar 12, 2016 · 7 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@ztane
Copy link
Mannequin

ztane mannequin commented Mar 12, 2016

BPO 26549
Nosy @brettcannon, @birkenfeld, @ncoghlan, @vstinner, @benjaminp, @meadori, @serhiy-storchaka, @1st1, @ztane, @JelleZijlstra
Superseder
  • bpo-29469: AST-level Constant folding
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-12-14.08:27:59.483>
    created_at = <Date 2016-03-12.22:45:45.201>
    labels = ['interpreter-core', 'performance']
    title = 'co_stacksize is calculated from unoptimized code'
    updated_at = <Date 2017-12-14.13:30:36.639>
    user = 'https://github.com/ztane'

    bugs.python.org fields:

    activity = <Date 2017-12-14.13:30:36.639>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-12-14.08:27:59.483>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2016-03-12.22:45:45.201>
    creator = 'ztane'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 26549
    keywords = []
    message_count = 7.0
    messages = ['261668', '261700', '264088', '265217', '265702', '308281', '308298']
    nosy_count = 10.0
    nosy_names = ['brett.cannon', 'georg.brandl', 'ncoghlan', 'vstinner', 'benjamin.peterson', 'meador.inge', 'serhiy.storchaka', 'yselivanov', 'ztane', 'JelleZijlstra']
    pr_nums = []
    priority = 'normal'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = '29469'
    type = 'performance'
    url = 'https://bugs.python.org/issue26549'
    versions = ['Python 3.6']

    @ztane
    Copy link
    Mannequin Author

    ztane mannequin commented Mar 12, 2016

    When answering a question on StackOverflow, I noticed that a function that only loads a constant tuple to a local variable still has a large co_stacksize as if it was built with BUILD_TUPLE.

    e.g.

        >>> def foo():
        ...     a = (1,2,3,4,5,6,7,8,9,10)
        ...
        >>> foo.__code__.co_stacksize
        10
        >>> dis.dis(foo)
          2           0 LOAD_CONST              11 ((1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
                      3 STORE_FAST               0 (a)
                      6 LOAD_CONST               0 (None)
                      9 RETURN_VALUE

    I suspect it is because in the makecode the stack usage is calculated from the unoptimized assembler output instead of the actual optimized bytecode. I do not know if there is any optimization that would increase the stack usage, but perhaps it should be calculated from the resulting output.

    @ztane ztane mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Mar 12, 2016
    @brettcannon
    Copy link
    Member

    I also suspect you're right, Antti, that the stack size is calculated prior to the bytecode being passed to through the peepholer which would have made the built tuple a value in the const array.

    Off the top of my head I don't remember where the stack size calculation is made, but my suspicion is it's in the AT -> bytecode step, which would mean making it work from bytecode would mean re-implementing that calculation to work from the bytecode itself (assuming I'm right).

    @JelleZijlstra
    Copy link
    Member

    This also affects co_consts, which includes constants that are no longer used by the optimized code:

    In [8]: def f():
    return (1, 2, 3, 4, 5)
    ...:

    In [9]: f.func_code.co_consts
    Out[9]: (None, 1, 2, 3, 4, 5, (1, 2, 3, 4, 5))

    In [12]: dis.dis(f)
    2 0 LOAD_CONST 6 ((1, 2, 3, 4, 5))
    3 RETURN_VALUE

    @meadori
    Copy link
    Member

    meadori commented May 10, 2016

    Strictly speaking, the stack size is calculated *after* the peephole optimizer is run, but the code that computes the stack depth operates on the basic block graph instead of the assembled and optimized byte code.

    Anyway, the conclusion is the same as Brett noted -- the stack depth analysis would need to be re-written to operate on the optimized bytecode array.

    @meadori
    Copy link
    Member

    meadori commented May 16, 2016

    See also bpo-24340

    @vstinner vstinner added the performance Performance or resource usage label May 18, 2016
    @serhiy-storchaka
    Copy link
    Member

    Seems moving constant folding to the AST level (bpo-29469) have solved this issue.

    >>> def foo():
    ...     a = (1,2,3,4,5,6,7,8,9,10)
    ... 
    >>> foo.__code__.co_stacksize
    1

    @vstinner
    Copy link
    Member

    Seems moving constant folding to the AST level (bpo-29469) have solved this issue.

    Wow, it's cool to see this bug finally fixed!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants