classification
Title: namedtuple should support fully qualified name for more portable pickling
Type: behavior Stage: needs patch
Components: Library (Lib) Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: BreamoreBoy, Christian.Tismer, barry, ctismer, dilettant, eli.bendersky, eric.smith, gvanrossum, josh.r, ncoghlan, python-dev, rhettinger, sbt, serhiy.storchaka
Priority: low Keywords: patch

Created on 2013-05-09 03:46 by eli.bendersky, last changed 2016-09-12 07:19 by rhettinger. This issue is now closed.

Files
File name Uploaded Description
nt_module.diff rhettinger, 2013-05-13 08:29 First draft patch to collections
Repositories containing patches
https://bitbucket.org/ctismer/namelesstuple
Messages (19)
msg188748 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-05-09 03:46
[this came up as part of the Enum discussions. Full details in this thread: http://mail.python.org/pipermail/python-dev/2013-May/126076.html]

namedtuple currently uses this code to obtain the __module__ for the class it creates dynamically so that pickling works:

  result.__module__ = _sys._getframe(1).f_globals.get('__name__', '__main__')

This may not work correctly on all Python implementations, for example IronPython.

To support some way to pickle on all implementations, namedtuple should support a fully qualified name for the class: 

  Point = namedtuple('mymodule.Point', ['x', 'y'])

If the name is a qualified dotted name, it will be split and the first part becomes the __module__.
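
The frame lookup quoted at the top of this message can be sketched as a small helper with a fallback for implementations where sys._getframe is missing or fails. This is only an illustration: the helper name caller_module_name and the namespace name fake_module are hypothetical, and this is not the code used in collections.

```python
import sys


def caller_module_name(default=None):
    """Return the __name__ found in the caller's globals, or `default`."""
    try:
        # Frame 0 is this helper; frame 1 is whoever called it.
        frame = sys._getframe(1)
    except (AttributeError, ValueError):
        # sys._getframe may be absent or unusable on some implementations.
        return default
    return frame.f_globals.get('__name__', '__main__')


# Simulate a call made from inside a module named "fake_module"
# (a hypothetical name) by executing code with custom globals.
namespace = {'__name__': 'fake_module',
             'caller_module_name': caller_module_name}
exec('detected = caller_module_name()', namespace)
print(namespace['detected'])
```

When the helper returns the default, the caller has no reliable module name, which is the case an explicit name is meant to cover.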
msg188762 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-09 10:33
> If the name is a qualified dotted name, it will be split and the first 
> part becomes the __module__.

That will not work correctly if the module name has a dot in it.
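
A short illustration of the objection, assuming a made-up qualified name package.sub.Point. Splitting at the first dot loses part of the module path; splitting at the last dot is one possible alternative, not something the proposal above specifies.

```python
qualified = 'package.sub.Point'  # hypothetical qualified name

# Splitting at the first dot keeps only the top-level package as the module.
module_first, _, rest = qualified.partition('.')

# Splitting at the last dot keeps the full dotted module path.
module_last, _, typename = qualified.rpartition('.')

print(module_first, rest)
print(module_last, typename)
```

Even the last-dot split cannot tell a dotted module path from a nested class name, which is one reason a separate argument is less ambiguous.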
msg188775 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2013-05-09 12:38
I agree it should work the same as Enum, and I agree it should be possible to supply the module name. But wouldn't it be cleaner as:

Point = namedtuple('Point', 'x y z', module=__name__)

rather than:

Point = namedtuple(__name__ + '.Point', 'x y z')

?
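
For reference, a sketch of the keyword form using the *module* parameter that the changeset recorded at the end of this issue added in Python 3.6. The module name shapes_demo is hypothetical and is registered by hand only so that the example is self-contained; in real code the class would normally live in an importable module and be created with module=__name__.

```python
import pickle
import sys
import types
from collections import namedtuple

# Build a stand-in module (hypothetical name) and make it importable.
mod = types.ModuleType('shapes_demo')
sys.modules[mod.__name__] = mod

# Pass the module name explicitly instead of relying on frame inspection.
mod.Point = namedtuple('Point', 'x y z', module=mod.__name__)

p = mod.Point(1, 2, 3)
restored = pickle.loads(pickle.dumps(p))

print(mod.Point.__module__)
print(restored == p, type(restored) is mod.Point)
```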
msg188785 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-05-09 17:50
Agreed with Eric. We're already modifying PEP 435 to do it that way.
msg188835 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-05-10 14:01
A question that came up while reviewing the new enum code: "module" or "module_name" for this extra argument? The former is shorter, but the latter is more correct in a way.
msg188836 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-05-10 14:02
Module, please. The class attribute is also called __module__ after all.
msg188838 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-05-10 14:04
> Module, please. The class attribute is also called __module__ after all.

Makes sense. Thanks.
msg189086 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-05-13 01:49
Marking this a low priority.  I want to see how it pans out for Enums before adding a new parameter to the namedtuple API.  Also, I would like to look at the test cases for Enum so I can see how you're writing tests that would fail without this parameter.
msg189167 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-05-13 19:03
I'd like to propose one slight tweak to the patch.  (Also to enum.py.)

If no module name was passed and _getframe() fails, can you set the __module__ attribute to something that will cause pickling the object to fail?  That would be much better than letting it pickle but then being unable to unpickle it.

(TBH I'm not sure how this should be accomplished, and if it can't, forget that I said it.  But it would be nice.)
msg189168 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-13 19:41
When pickling a class (or instance of a class) there is already a check 
that the invariant

     getattr(sys.modules[cls.__module__], cls.__name__) == cls

holds.

>>> import pickle
>>> class A: pass
...
>>> A.__module__ = 'nonexistent'
>>> pickle.dumps(A())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class 'nonexistent.A'>: import of module 'nonexistent' failed
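
The same check can be observed with a namedtuple whose module name cannot be imported. This is a minimal sketch that assumes Python 3.6 or later for the *module* keyword; nonexistent_module_demo is a made-up name, and the exact error message varies between versions.

```python
import pickle
from collections import namedtuple

# 'nonexistent_module_demo' is a made-up name that cannot be imported.
Point = namedtuple('Point', 'x y', module='nonexistent_module_demo')

try:
    pickle.dumps(Point(1, 2))
except pickle.PicklingError as exc:
    # The failure happens at dump time, not later at load time.
    failed = True
    print('pickling refused:', exc)
else:
    failed = False
```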
msg189171 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-05-13 20:02
Great, forget I said anything then.

LGTM to the patch, feel free to commit (with update to Misc/NEWS please).
msg189177 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-05-13 20:56
LGTM too.  Needs test and docs.
msg190977 - (view) Author: Christian Tismer (Christian.Tismer) * (Python committer) Date: 2013-06-11 17:17
I would like to make an additional suggestion.
(and I implemented this yesterday):

Native namedtuple (not a derived class) can be made much simpler to handle
when no module and class machinery is involved at all.

The way I implemented it has no need for sys._getframe, and it does
not store a reference to the class at all.

The advantage of doing so is that this maximizes the compatibility
with ordinary tuples. Ordinary tuples have no pickling issue at all,
and this way the named tuple should behave as well.

My implementation re-creates the namedtuple classes on the fly by a
function in __reduce_ex__. There is no speed penalty for this because
of caching the classes by their unique name and set of field names.

This is IMHO the way things should work:
A namedtuple replaces a builtin type, so it has the same pickling
behavior: nothing needed.

Rationale:
tuples are used everywhere and dynamically. namedtuple should be as
compatible with that as possible. By having to specify a module etc., this dynamism is partially lost.

Limitation:
When a class is derived from namedtuple, pickling support is no longer
automated. This is compatible with ordinary tuples as well.

Cheers - Chris
msg220444 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-06-13 14:34
Just slipped under the radar?
msg220521 - (view) Author: Josh Rosenberg (josh.r) * Date: 2014-06-14 02:02
I was already thinking of the same solution Christian.Tismer proposed before I reached his post. namedtuples seem cleaner if they naturally act as singletons, so that (potentially whether or not pickling is involved) declaring a namedtuple with the same name and fields twice returns the same class. That removes the need to deal with module-qualified names. If pickle can be made to support it, the namedtuple class itself could also be pickled uniquely, in such a way that it could be recreated by someone else who didn't even have a local definition of the namedtuple, the module that defines it, etc.
msg220523 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-06-14 02:11
> Just slipped under the radar?

Nope, I'll get to it before long.
msg220568 - (view) Author: Christian Tismer (ctismer) Date: 2014-06-14 17:11
Ok, I almost forgot about this because I thought my idea
was not considered, and wondered if I should keep that code online.

It is still around, and I could put it into an experimental branch
if someone asks for it.

Missing pull-request on python.org.
msg244149 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-27 08:48
> That will not work correctly if the module name has a dot in it.

Pickling qualified names with an arbitrary number of dots is supported in 3.4 with protocol 4 and in 3.5 with all protocols (backward compatibly).
msg275982 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-12 07:18
New changeset c8851a0ce7ca by Raymond Hettinger in branch 'default':
Issue #17941: Add a *module* parameter to collections.namedtuple()
https://hg.python.org/cpython/rev/c8851a0ce7ca
History
Date User Action Args
2016-09-12 07:19:09 rhettinger set status: open -> closed; resolution: fixed; versions: + Python 3.6, - Python 3.5
2016-09-12 07:18:42 python-dev set nosy: + python-dev; messages: + msg275982
2015-05-27 08:48:21 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg244149
2014-06-15 01:03:46 rhettinger set versions: + Python 3.5, - Python 3.4
2014-06-14 17:11:17 ctismer set nosy: + ctismer; messages: + msg220568
2014-06-14 02:11:57 rhettinger set messages: + msg220523
2014-06-14 02:02:46 josh.r set nosy: + josh.r; messages: + msg220521
2014-06-13 14:34:41 BreamoreBoy set nosy: + BreamoreBoy; messages: + msg220444
2013-06-11 17:17:55 Christian.Tismer set hgrepos: + hgrepo198; messages: + msg190977; nosy: + Christian.Tismer
2013-06-10 06:32:58 dilettant set nosy: + dilettant
2013-05-13 20:56:54 barry set messages: + msg189177
2013-05-13 20:02:02 gvanrossum set messages: + msg189171
2013-05-13 19:41:31 sbt set messages: + msg189168
2013-05-13 19:03:13 gvanrossum set messages: + msg189167
2013-05-13 08:29:35 rhettinger set files: + nt_module.diff; keywords: + patch
2013-05-13 01:49:24 rhettinger set priority: normal -> low; messages: + msg189086
2013-05-12 13:12:23 ncoghlan link issue17963 dependencies
2013-05-11 02:59:26 rhettinger set assignee: rhettinger; nosy: + rhettinger
2013-05-10 14:04:31 eli.bendersky set messages: + msg188838
2013-05-10 14:02:56 gvanrossum set messages: + msg188836
2013-05-10 14:01:12 eli.bendersky set messages: + msg188835
2013-05-09 17:50:31 gvanrossum set nosy: + gvanrossum; messages: + msg188785
2013-05-09 13:36:44 barry set nosy: + barry
2013-05-09 12:38:03 eric.smith set nosy: + eric.smith; messages: + msg188775
2013-05-09 10:33:59 sbt set nosy: + sbt; messages: + msg188762
2013-05-09 03:46:56 eli.bendersky create