classification
Title: Intern namedtuple field names
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.6
process
Status: closed Resolution: duplicate
Dependencies: Superseder: String literals are not interned if in a tuple
View: 26148
Assigned To: serhiy.storchaka Nosy List: benjamin.peterson, brett.cannon, georg.brandl, ncoghlan, rhettinger, serhiy.storchaka, yselivanov
Priority: normal Keywords: patch

Created on 2015-12-30 20:05 by serhiy.storchaka, last changed 2016-04-26 08:51 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
namedtuple_intern_field_names.patch serhiy.storchaka, 2015-12-30 20:05
show_all_fieldnames_interned.py rhettinger, 2015-12-31 01:03
alternate_namedtuple_intern.patch rhettinger, 2015-12-31 05:13 Alternate patch (post-exec)
show_all_fieldnames_interned2.py rhettinger, 2015-12-31 05:14 Show that _fields are unaffected by the first patch
Messages (9)
msg257238 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-12-30 20:05
If we intern the field names in namedtuple, access to them will be faster, because names can be compared by identity alone in dict lookups. This can also make pickles containing namedtuples more compact.
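
A minimal sketch of the two effects described above, assuming CPython. It is not taken from the attached patch, and the variable names are illustrative only. Two equal strings built at run time are distinct objects; sys.intern() maps both to one canonical object, and pickle, which memoizes by object identity, then writes the shared string only once.

```python
import pickle
import sys

# Build two equal strings at run time so that they are distinct objects.
a = "".join(["abc", "123"])
b = "".join(["abc", "123"])
assert a == b and a is not b

# sys.intern() maps equal strings to one canonical object.
ia, ib = sys.intern(a), sys.intern(b)
assert ia is ib

# pickle memoizes by identity: a shared object is written once and then
# referenced, while two distinct equal strings are both written in full.
distinct = pickle.dumps([a, b])
shared = pickle.dumps([ia, ib])
print(len(distinct), len(shared))
```

The size difference per string is small, which matches the later remark in this thread that the benefits are tiny.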
msg257244 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-12-31 00:32
Doesn't interning happen already as a byproduct of the exec?
msg257246 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-12-31 03:00
Indeed, keys of __dict__ are interned. But elements of _fields are not.

>>> from collections import namedtuple
>>> A = namedtuple('A', 'abc123 def456')
>>> sorted(A.__dict__)[-1] == A._fields[-1]
True
>>> sorted(A.__dict__)[-1] is A._fields[-1]
False
msg257247 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-12-31 05:13
I don't see how the proposed patch would affect the result.  ISTM that the interning would have to happen after the template substitution and exec.  See the attached alternate patch.

That said, I'm not too concerned with the contents of _fields not being interned.  It isn't the performance sensitive part of the code.
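
As a rough illustration of interning names explicitly at run time, here is a sketch using sys.intern(). The helper name intern_field_names is hypothetical, and this is not the content of either attached patch; it only shows the general technique of replacing each name with its canonical interned object.

```python
import sys

def intern_field_names(field_names):
    """Return a tuple of interned field names (hypothetical helper)."""
    return tuple(sys.intern(str(name)) for name in field_names)

# str.split() creates fresh string objects at run time; they are not interned.
raw = "abc123 def456".split()
fields = intern_field_names(raw)

# Each element is now the canonical interned object for its value, so
# interning it again returns the very same object.
print(all(f is sys.intern(f) for f in fields))
```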
msg257258 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-12-31 17:12
Now that you have pointed out that the keys in the dictionary are already interned, I'm no longer sure that interning the _fields items is needed.

I wonder why short identifier-like string literals in the compiled _fields are not interned.
msg257285 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-01-01 10:47
The benefits are tiny, but if the one-line patch looks good, we might as well intern the _fields and save a few bytes.
msg257286 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-01 12:23
Interestingly, short string literals are usually interned, but they are not interned inside a tuple literal.

>>> namespace = {}
>>> exec('a = ["abc123"]\ndef abc123(): pass', namespace)
>>> namespace['abc123'].__name__ is namespace['a'][0]
True
>>> exec('a = ("abc123",)\ndef abc123(): pass', namespace)
>>> namespace['abc123'].__name__ is namespace['a'][0]
False
>>> namespace['abc123'].__name__ == namespace['a'][0]
True

I think it would be better to change the compiler to always intern short string literals. And patching namedtuple will not be needed.
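
To see where the literal ends up in each case, one can inspect the constants of the compiled code. The sketch below is an illustration only: the helper string_constants is hypothetical, and the idea that a string nested inside a tuple constant is treated differently from a top-level constant is an assumption about the compiler of that time, not something confirmed in this thread. The nesting depth reported may vary between Python versions.

```python
def string_constants(source):
    """Yield (depth, value) for each str constant in the compiled source.

    Depth 0 means the string is a top-level entry of co_consts; a larger
    depth means it is nested inside a tuple (or frozenset) constant.
    """
    def walk(consts, depth):
        for const in consts:
            if isinstance(const, str):
                yield depth, const
            elif isinstance(const, (tuple, frozenset)):
                yield from walk(const, depth + 1)

    code = compile(source, "<example>", "exec")
    yield from walk(code.co_consts, 0)

# Compare a list display with a tuple display holding the same literal.
for src in ('a = ["abc123"]', 'a = ("abc123",)'):
    print(src, list(string_constants(src)))
```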
msg264235 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-04-26 08:30
> I think it would be better to change the compiler to always 
> intern short string literals. And patching namedtuple will
> not be needed.

Can we close this entry?   If you do patch the compiler, a separate tracker item can be opened.
msg264238 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-04-26 08:51
I have already opened a separate issue, issue26148. Since I am sure this is the correct way, I'm closing this issue. I'll reopen it if issue26148 is rejected.
History
Date User Action Args
2016-04-26 08:51:57 serhiy.storchaka set status: open -> closed; resolution: duplicate; stage: patch review -> resolved
2016-04-26 08:51:12 serhiy.storchaka set superseder: String literals are not interned if in a tuple; messages: + msg264238
2016-04-26 08:30:06 rhettinger set assignee: rhettinger -> serhiy.storchaka; messages: + msg264235
2016-01-01 12:23:49 serhiy.storchaka set nosy: + brett.cannon, georg.brandl, ncoghlan, benjamin.peterson, yselivanov; messages: + msg257286
2016-01-01 10:47:00 rhettinger set messages: + msg257285
2015-12-31 17:12:28 serhiy.storchaka set messages: + msg257258
2015-12-31 05:14:47 rhettinger set files: + show_all_fieldnames_interned2.py
2015-12-31 05:13:24 rhettinger set files: + alternate_namedtuple_intern.patch; messages: + msg257247
2015-12-31 03:00:08 serhiy.storchaka set messages: + msg257246
2015-12-31 01:03:50 rhettinger set files: + show_all_fieldnames_interned.py
2015-12-31 00:32:51 rhettinger set messages: + msg257244
2015-12-30 20:05:39 serhiy.storchaka create