Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default value identity regression #72129

Closed
kayhayen mannequin opened this issue Sep 2, 2016 · 15 comments
Closed

Default value identity regression #72129

kayhayen mannequin opened this issue Sep 2, 2016 · 15 comments
Assignees
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@kayhayen
Copy link
Mannequin

kayhayen mannequin commented Sep 2, 2016

BPO 27942
Nosy @rhettinger, @larryhastings, @benjaminp, @ned-deily, @stevendaprano, @serhiy-storchaka, @1st1, @MojoVampire
Files
  • intern_string_constants.patch
  • refleak.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-11-09.14:44:24.693>
    created_at = <Date 2016-09-02.16:45:37.526>
    labels = ['interpreter-core', '3.7', 'performance']
    title = 'Default value identity regression'
    updated_at = <Date 2017-12-19.08:50:39.586>
    user = 'https://bugs.python.org/KayHayen'

    bugs.python.org fields:

    activity = <Date 2017-12-19.08:50:39.586>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-11-09.14:44:24.693>
    closer = 'yselivanov'
    components = ['Interpreter Core']
    creation = <Date 2016-09-02.16:45:37.526>
    creator = 'Kay.Hayen'
    dependencies = []
    files = ['44807', '45407']
    hgrepos = []
    issue_num = 27942
    keywords = ['patch']
    message_count = 15.0
    messages = ['274257', '274262', '274263', '277313', '277323', '277359', '277748', '277749', '280395', '280399', '280400', '280401', '280402', '280403', '280404']
    nosy_count = 11.0
    nosy_names = ['rhettinger', 'larry', 'benjamin.peterson', 'ned.deily', 'steven.daprano', 'kayhayen', 'python-dev', 'serhiy.storchaka', 'yselivanov', 'Kay.Hayen', 'josh.r']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue27942'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7']

    @kayhayen
    Copy link
    Mannequin Author

    kayhayen mannequin commented Sep 2, 2016

    Consider this:

    def defaultKeepsIdentity(arg = "str_value"):
        print(arg is "str_value")
    
    defaultKeepsIdentity()

    This has been outputing "True" on every Python release I have seen so far, but not so on 3.6.0a4. Normally string values come from a "co_const" and will be "is" identical if used as literals in a module, but no longer in this case.

    This seems wasteful at best, needlessly increasing the number of strings in usage.

    Yours,
    Kay

    @kayhayen kayhayen mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Sep 2, 2016
    @stevendaprano
    Copy link
    Member

    Can confirm the expected behaviour (printing True) in Python 2.4 through 2.7, 3.3, Jython 2.5, and even venerable old Python 1.5 (where it prints 1).

    But *not* IronPython 2.6, where it prints False.

    In 3.6, the difference seems to be here:

    py> f = defaultKeepsIdentity
    py> f.__defaults__[0] is f.__code__.co_consts[1]
    False
    py> f.__defaults__[0] == f.__code__.co_consts[1]
    True

    This behaviour is not specified by the language. Caching and re-use of strings has always been subject to change. Nevertheless, perhaps it is time for this to be make a language feature: inside a function, any use of the same string literal should use the same object?

    @serhiy-storchaka
    Copy link
    Member

    This likely is an interference of bpo-27095 with bpo-26148. I consider this as a regression. This causes not only a waste of memory, but can affect a performance, since comparing non-identical strings is slower than comparing identical strings.

    @kayhayen
    Copy link
    Mannequin Author

    kayhayen mannequin commented Sep 24, 2016

    Same with 3.6b1, still present.

    @rhettinger
    Copy link
    Contributor

    It would be nice to get this fixed.

    @serhiy-storchaka
    Copy link
    Member

    Proposed patch interns string constants recursively in tuples and frozensets. This fixes bpo-26148 and bpo-25981.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Sep 25, 2016
    @serhiy-storchaka serhiy-storchaka self-assigned this Sep 29, 2016
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 30, 2016

    New changeset 78bea78d9335 by Serhiy Storchaka in branch '3.5':
    Issue bpo-27942: String constants now interned recursively in tuples and frozensets.
    https://hg.python.org/cpython/rev/78bea78d9335

    New changeset 0ce63a7651b9 by Serhiy Storchaka in branch '3.6':
    Issue bpo-27942: String constants now interned recursively in tuples and frozensets.
    https://hg.python.org/cpython/rev/0ce63a7651b9

    New changeset 44af6bd21b94 by Serhiy Storchaka in branch 'default':
    Issue bpo-27942: String constants now interned recursively in tuples and frozensets.
    https://hg.python.org/cpython/rev/44af6bd21b94

    New changeset 7cea3bf44acb by Serhiy Storchaka in branch '2.7':
    Issue bpo-27942: String constants now interned recursively in tuples and frozensets.
    https://hg.python.org/cpython/rev/7cea3bf44acb

    @serhiy-storchaka
    Copy link
    Member

    Thanks Naoki and Josh for reviews.

    Backported to 2.7 and 3.5 since the patch also fixes other issues.

    Note that the result of arg is "str_value" is implementation detail. Other Python implementations may not intern string constants. Even CPython interns only strings constants consisting of ASCII alphanumerical characters.

    @1st1
    Copy link
    Member

    1st1 commented Nov 9, 2016

    The patch is causing refleaks:

    test_ast leaked [98, 98, 98] references, sum=294
    test_code leaked [2, 2, 2] references, sum=6
    

    Please review the attached patch fixing that. Please review.

    @1st1 1st1 reopened this Nov 9, 2016
    @1st1
    Copy link
    Member

    1st1 commented Nov 9, 2016

    Looks like the attached patch also fixes

    test_trace leaked [12, 12, 12] references, sum=36

    @serhiy-storchaka
    Copy link
    Member

    Good catch! The patch LGTM, thanks Yury.

    Interesting, what tests in test_ast leak? I expected that this branch is executed in rare circumstances.

    @1st1
    Copy link
    Member

    1st1 commented Nov 9, 2016

    Interesting, what tests in test_ast leak? I expected that this branch is executed in rare circumstances.

    The test that tests it all :) test_stdlib_validates

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 9, 2016

    New changeset 41613bb27f80 by Yury Selivanov in branch '2.7':
    Issue bpo-27942: Fix memory leak in codeobject.c
    https://hg.python.org/cpython/rev/41613bb27f80

    New changeset 2c6825c9ecfd by Yury Selivanov in branch '3.5':
    ssue bpo-27942: Fix memory leak in codeobject.c
    https://hg.python.org/cpython/rev/2c6825c9ecfd

    New changeset b671ac7ae620 by Yury Selivanov in branch '3.6':
    Merge 3.5 (issue bpo-27942)
    https://hg.python.org/cpython/rev/b671ac7ae620

    New changeset c27269c0d619 by Yury Selivanov in branch 'default':
    Merge 3.6 (issue bpo-27942)
    https://hg.python.org/cpython/rev/c27269c0d619

    @1st1
    Copy link
    Member

    1st1 commented Nov 9, 2016

    Thanks for the review, Serhiy!

    @1st1 1st1 closed this as completed Nov 9, 2016
    @serhiy-storchaka
    Copy link
    Member

    The test that tests it all :) test_stdlib_validates

    :)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants