Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intern namedtuple field names #70169

Closed
serhiy-storchaka opened this issue Dec 30, 2015 · 9 comments
Closed

Intern namedtuple field names #70169

serhiy-storchaka opened this issue Dec 30, 2015 · 9 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 25981
Nosy @brettcannon, @birkenfeld, @rhettinger, @ncoghlan, @benjaminp, @serhiy-storchaka, @1st1
Superseder
  • bpo-26148: String literals are not interned if in a tuple
  • Files
  • namedtuple_intern_field_names.patch
  • show_all_fieldnames_interned.py
  • alternate_namedtuple_intern.patch: Alternate patch (post-exec)
  • show_all_fieldnames_interned2.py: Show that _fields are unaffected by the first patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-04-26.08:51:57.560>
    created_at = <Date 2015-12-30.20:05:39.009>
    labels = ['type-feature', 'library']
    title = 'Intern namedtuple field names'
    updated_at = <Date 2016-04-26.08:51:57.560>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2016-04-26.08:51:57.560>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-04-26.08:51:57.560>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2015-12-30.20:05:39.009>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['41452', '41455', '41457', '41458']
    hgrepos = []
    issue_num = 25981
    keywords = ['patch']
    message_count = 9.0
    messages = ['257238', '257244', '257246', '257247', '257258', '257285', '257286', '264235', '264238']
    nosy_count = 7.0
    nosy_names = ['brett.cannon', 'georg.brandl', 'rhettinger', 'ncoghlan', 'benjamin.peterson', 'serhiy.storchaka', 'yselivanov']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '26148'
    type = 'enhancement'
    url = 'https://bugs.python.org/issue25981'
    versions = ['Python 3.6']

    @serhiy-storchaka
    Copy link
    Member Author

    If intern field names in namedtuple, this will speed up the access to them, because names could be compared just by identity in dict lookup. This can also make pickles containing namedtuples more compact.

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Dec 30, 2015
    @rhettinger
    Copy link
    Contributor

    Doesn't interning happen already as a byproduct of the exec?

    @serhiy-storchaka
    Copy link
    Member Author

    Indeed, keys of __dict__ are interned. But elements of _fields are not.

    >>> A = namedtuple('A', 'abc123 def456')
    >>> sorted(A.__dict__)[-1] == A._fields[-1]
    True
    >>> sorted(A.__dict__)[-1] is A._fields[-1]
    False

    @rhettinger
    Copy link
    Contributor

    I don't see how the proposed patch would affect the result. ISTM that the interning would have to happen after the template substitution and exec. See the attached alternate patch.

    That said, I'm not too concerned with the contents of _fields not being interned. It isn't the performance sensitive part of the code.

    @serhiy-storchaka
    Copy link
    Member Author

    Now, after you pointed out that the keys in the dictionary already interned, I'm not sure anymore that it is needed to intern the _fields items.

    I'm wondering, why short identifier-like string literals in compiled _fields are not interned?

    @rhettinger
    Copy link
    Contributor

    The benefits are tiny, but if the one line patch looks good, we might as well intern the _fields and save a few bytes.

    @serhiy-storchaka
    Copy link
    Member Author

    Interesting, short string literals usually are interned, but they are not interned in tuple literal.

    >>> namespace = {}
    >>> exec('a = ["abc123"]\ndef abc123(): pass', namespace)
    >>> namespace['abc123'].__name__ is namespace['a'][0]
    True
    >>> exec('a = ("abc123",)\ndef abc123(): pass', namespace)
    >>> namespace['abc123'].__name__ is namespace['a'][0]
    False
    >>> namespace['abc123'].__name__ == namespace['a'][0]
    True

    I think it would be better to change the compiler to always intern short string literals. And patching namedtuple will be not needed.

    @rhettinger
    Copy link
    Contributor

    I think it would be better to change the compiler to always
    intern short string literals. And patching namedtuple will
    be not needed.

    Can we close this entry? If you do patch the compiler, a separate tracker item can be opened.

    @serhiy-storchaka
    Copy link
    Member Author

    I already opened separate bpo-26148. Since I sure this is the correct way, I'm closing this issue. I'll reopen it in case of bpo-26148 will be rejected.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants