classification
Title: Enhance Object/structseq.c to match namedtuple and tuple api
Type: enhancement Stage: patch review
Components: Extension Modules Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, abarnert, adlaiff6, amaury.forgeotdarc, belopolsky, benjamin.peterson, christian.heimes, eric.araujo, eric.smith, eric.snow, georg.brandl, giampaolo.rodola, gjb1002, jackdied, lemburg, rhettinger, salty-horse, santoso.wijaya, sunfinite, terry.reedy, vstinner
Priority: low Keywords: easy, patch

Created on 2008-01-14 04:20 by christian.heimes, last changed 2016-10-03 16:19 by belopolsky.

Files
File name Uploaded Description Edit
structseq_subclasses_tuple.diff adlaiff6, 2008-01-14 07:03 patch which makes structseq a subclass of tuple review
structseq.diff rhettinger, 2008-01-15 01:21 Tuple subclass patch with tests.
structseq_1.patch sunfinite, 2013-11-07 12:40 Add _fields, _replace, _asdict and eval(repr(s)) review
structseq_fields.patch sunfinite, 2014-05-26 10:17 Add only _fields review
structseq_repr_issue.diff skrah, 2014-06-29 21:13 repr() issue with unnamed fields review
Messages (42)
msg59888 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-14 04:20
Raymond Hettinger wrote:

Here's a couple more if you want to proceed down that path:

1. Have structseq subclass from PyTupleObject so that isinstance(s,
tuple) returns True.  This makes the object usable whenever 
tuples are needed.

2. Add _fields, _asdict, and _replace to match the API in
collections.namedtuple().  The _fields tuple should only include the 
visible positional fields while _asdict() and _replace() should include
all of the fields whether visible or accessible only by 
attribute access.

3. Change the constructor to accept keyword args so that eval(repr(s))
== s works.

NOTE:
I've marked the task as easy but it's not a task for a total newbie.
It's a feasible yet challenging task for somebody who likes to get into
CPython core programming. Basic C knowledge is required!
msg59891 - (view) Author: Leif Walsh (adlaiff6) Date: 2008-01-14 07:03
Here is a patch for #1.  I ran make test, and nothing was broken that
seemed to be my fault, so I assume it's okay.

Yes, it's small, it's my first one here.  I'll get to the other two
tomorrow.
msg59949 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-01-15 01:21
Thanks for the patch.  I removed the whitespace changes and added some
tests to make sure structseq now works with the % formatting operator
and isinstance(t,tuple).

Am getting a sporadic segfault in test_zipimport when running "make
test", so holding-off on applying:

test_zipimport
~/py26/Lib/test/test_zipimport.py:91: ImportWarning: Not importing
directory '/home/rhettinger/py26/Modules/zlib': missing __init__.py
  ["__dummy__"])
make: *** [test] Segmentation fault
msg59955 - (view) Author: Leif Walsh (adlaiff6) Date: 2008-01-15 02:01
I just svn upped (it updated zipimport) and applied your patch, and
'./python Lib/test/regrtest.py test_zipimport.py' says it's okay, so I
would go ahead and commit it.
msg59956 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-01-15 02:05
Run, "make test" a few times to make sure it doesn't bomb.

The problem may be due to needing a deferred_type instead of assigning
&PyTupleObject directly.  Will look it more later.
msg59957 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-01-15 03:03
It worked fine on a fresh check-out.  Committed in revision 59967.
msg62830 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-02-23 22:49
Is there something else to be done for this to be closed?
msg62831 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-02-23 22:54
All three items are still open.  The second one is the easiest.
msg81627 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-02-11 04:46
See also issue 2308
msg85097 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-04-01 21:16
Jack, do you have any interest in putting this one over the goal line?
msg90461 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-07-12 22:48
In Py3.x, this fails:
    "%s.%s.%s-%s-%s" % sys.version_info

The reason is that PyUnicode_Format() expects a real tuple, not a tuple
lookalike.  The fix is to either have structseq inherit from tuple or to
modify PyUnicode_Format() to handle structseq:

   if (PyCheck_StructSeq(args)) {
      newargs = PyTuple_FromSequence(args);
      if (newargs == NULL)
          return NULL;
      result = PyUncode_Format(format, newargs);
      Py_DECREF(newargs);
      return result;
   }
msg90472 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2009-07-13 08:56
Raymond Hettinger wrote:
> Raymond Hettinger <rhettinger@users.sourceforge.net> added the comment:
> 
> In Py3.x, this fails:
>     "%s.%s.%s-%s-%s" % sys.version_info
> 
> The reason is that PyUnicode_Format() expects a real tuple, not a tuple
> lookalike.  The fix is to either have structseq inherit from tuple or to
> modify PyUnicode_Format() to handle structseq:
> 
>    if (PyCheck_StructSeq(args)) {
>       newargs = PyTuple_FromSequence(args);
>       if (newargs == NULL)
>           return NULL;
>       result = PyUncode_Format(format, newargs);
>       Py_DECREF(newargs);
>       return result;
>    }

-1

The special-casing of tuples vs. non-tuples for % is already
bad enough. Adding structseq as another special case doesn't
make that any better.

What's so hard about writing

"%s.%s.%s-%s-%s" % tuple(sys.version_info)

anyway ?
msg90506 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-07-13 22:01
ISTM that structseq should have been a tuple subclass anyway.  Isn't it
advertised as some sort of "tuple with attribute access"?
msg90953 - (view) Author: Ori Avtalion (salty-horse) Date: 2009-07-26 18:03
For those who missed it, the patch that was committed in r59967 was
quickly reverted in r59970 with the comment:

"Temporarily revert 59967 until GC can be added."

Raymond, can you please explain what was missing from the patch?
msg104559 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-04-29 18:21
See also issue 8413, which would be addressed by this change.
msg104562 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-04-29 18:34
I agree that the priority is higher now that we have a demonstrable regression.

Getting structseq to subclass from tuple will take some effort (tuples have many methods that would need to be overriden).
msg109503 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-07-07 21:31
structseq now does subclass tuple, so if there's any interest in adding namedtuple APIs, now it should be easier.
msg113451 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-09 18:50
Would whatever remains of this be deferred by the PEP3003 moratorium?
msg113453 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-08-09 18:55
I don't think it would be covered by the moratorium, since it's not a language change. The change to make structseq derive from tuple was not subject to the moratorium, for example.
msg113456 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-08-09 19:02
This is definitely not covered by the language moratorium.  Guido has requested this change and it needs to go forward.
msg133351 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-04-08 23:02
Issue5907 would benefit of this change.
Unfortunately, structseq constructors already have keyword arguments; they are equivalent to "def __new__(cls, sequence, dict=NULL)".
OTOH these keywords arguments are not documented anywhere.

I suggest to change the constructor to something equivalent to:
"def __new__(cls, sequence=NULL, dict=NULL, *, field1=NULL,  field2=NULL)"
msg133355 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-08 23:23
Also see issue 11698
msg133364 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-04-09 00:53
Hasn't this been fixed in the following changeset?

changeset:   43509:384f73a104e9
user:        Benjamin Peterson <benjamin@python.org>
date:        Wed Jul 07 20:54:01 2010 +0000
summary:     make struct sequences subclass tuple; kill lots of code
msg133367 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-09 01:02
> Hasn't this been fixed in the following changeset?

It was a major step forward.  Now there needs to be work on other namedtuple methods and whatnot.
msg191872 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-25 18:09
Would we also want to implement _make().
msg192849 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-07-11 07:29
Unassigning until another patch arrives that moves this forward.

The basic idea is well established:  Get the structseq API to match the named tuple API as much as possible.
msg198291 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2013-09-22 16:23
See also #5907.
msg202330 - (view) Author: Sunny K (sunfinite) * Date: 2013-11-07 12:40
New patch for 3.4 adds the following:

1. _fields
2. _replace()
3. _asdict()
4. eval(repr(s)) == s

Now the issues:

1. _asdict() returns a normal dictionary. I don't know if this is what
   is required.

2. Both _asdict() and _replace() assume that unnamed visible fields are
   at the end of the visible sequence.  A comment at the beginning says
   they are allowed for indices < n_visible_fields.
   
   Is there another way to map members to values? Because tp->members
   is (visible named fields + invisible named fields) whereas values
   array is (visible named fields + visible unnamed fields + invisible
   named fields)

3. The mismatch mentioned above is present in the current
   implementation of repr:

   In os.stat_result, the last three visible fields are unnamed
   (i.e integer a_time, m_time and c_time). However they are present 
   in the repr output with keys which are the first three keys from the
   invisible part(float a_time, m_time and c_time).
   Was this intentional?

   Also, the above logic causes duplicate keys when invisible fields 
   are included in the repr output as per issue11629.
  
   In my patch for that issue, i put invisible fields under the 'dict'
   keyword argument. This output format utilises code already present 
   in structseq_new and makes eval work as expected when invisible 
   fields are present in repr. Also, it sidesteps the question of
   duplicated keys because they are present inside a dict.

4. Raymond stated that _fields should contain only the visible 
   positional fields. So _fields of os.stat_result contains only 7
   fields because of the three unnamed fields. Is this the expected 
   implementation?

5. Is there a way to declare a member function in C that accepts only 
   keyword arguments(like _replace())? I could not find one.

This is my first real C patch. So some of the issues might just be a
lack of understanding. Apologies.
msg202333 - (view) Author: Sunny K (sunfinite) * Date: 2013-11-07 12:49
Oops, the correct issue for improving the repr is issue11698.
msg207994 - (view) Author: Andrew Barnert (abarnert) * Date: 2014-01-13 01:31
See issue20230, which includes a patch implementing just the first part of part 2 (adding _fields, but not _asdict or _replace).

Since this one has been open for 5+ years, if it's not going to be done soon, any chance of the easy (_fields) change being committed and putting off the harder parts until later?
msg217301 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-04-27 18:14
>> 1. _asdict() returns a normal dictionary. I don't know if this is what
>>    is required.

Good question. I don't think we can import OrderedDict from collections
because of the impact on startup time (_collections_abc was created to
avoid the issue for MutableMapping).
msg217337 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-04-28 02:17
> any chance of the easy (_fields) change being committed 
> and putting off the harder parts until later?

Yes, it is not an all-or-nothing exercise.


>> 1. _asdict() returns a normal dictionary. I don't know if this is what
>>    is required.

A regular dict would work just fine for now (there is a patch to introduce an OrderedDict in C, but that transition could be done later.
msg217365 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-04-28 11:50
Is accessing _fields a common operation?  Personally I'd use a
PyGetSetDef and generate the tuple on access (perhaps cache the
result).
msg217468 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-04-29 05:23
> Is accessing _fields a common operation?

Not common at all -- it is used for introspection.

That said, there is nearly zero savings from generating the tuple upon look-up.  I would say that using a PyGetSetDef is a false optimization.
msg217512 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-04-29 11:55
Okay, if no one else wants this, I'll go ahead with the _fields part.

Andrew, could you sign a contributor agreement?
msg219037 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-05-24 12:19
Hi Andrew.  Did you by any chance sign the contributor agreement?

[It's perfectly okay if you don't want to, but then we cannot use
the patch from #20230.]
msg219150 - (view) Author: Sunny K (sunfinite) * Date: 2014-05-26 10:17
Hi Stefan,

I've added a new patch which only adds _fields, combining parts from my earlier patch and Andrew's (his patch does not account for visible unnamed fields).
msg219297 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-05-28 20:36
Sunny, is there a definition of "visible positional fields"? Currently,
it seems to me that in os.stat_result we have the opposite problem, namely
"visible non-positional" fields:

For example, st_atime_ns is visible but not indexable:

os.stat_result(st_mode=33188, st_ino=524840, st_dev=64513, st_nlink=1, st_uid=0, st_gid=0, st_size=2113, st_atime=1401225421, st_mtime=1398786163, st_ctime=1398786163)
>>> x[9]
1398786163
>>> x.st_atime_ns
1401225421887116783
>>> x[10] 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
msg219381 - (view) Author: Sunny K (sunfinite) * Date: 2014-05-30 09:38
Hi Stefan,

There is a comment at the top in structseq.c

/* Fields with this name have only a field index, not a field name.
   They are only allowed for indices < n_visible_fields. */
char *PyStructSequence_UnnamedField = "unnamed field";

This is the definition of os.stat_result in posixmodule.c:

static PyStructSequence_Field stat_result_fields[] = {
    {"st_mode",    "protection bits"},
    {"st_ino",     "inode"},
    {"st_dev",     "device"},
    {"st_nlink",   "number of hard links"},
    {"st_uid",     "user ID of owner"},
    {"st_gid",     "group ID of owner"},
    {"st_size",    "total size, in bytes"},
    /* The NULL is replaced with PyStructSequence_UnnamedField later. */
    {NULL,   "integer time of last access"},
    {NULL,   "integer time of last modification"},
    {NULL,   "integer time of last change"},
    {"st_atime",   "time of last access"},
    {"st_mtime",   "time of last modification"},
    {"st_ctime",   "time of last change"},
    {"st_atime_ns",   "time of last access in nanoseconds"},
    {"st_mtime_ns",   "time of last modification in nanoseconds"},
    {"st_ctime_ns",   "time of last change in nanoseconds"},
      ...


By visible i mean the values present in the repr of the object and by-extension accessible by position.

I talked about my problems with os.stat_result in points 3 and 4 of msg202333 i.e repr of os.stat_result takes the first three keys from the attribute-access only fields (the invisible part) and uses them for the last three values in the displayed structseq.

With my current patch, _fields for os.stat_result only has 7 values:

>>> x = os.stat('.')
>>> x._fields
('st_mode', 'st_ino', 'st_dev', 'st_nlink', 'st_uid', 'st_gid', 'st_size')
>>>

Is this what you expect?
msg220315 - (view) Author: Andrew Barnert (abarnert) * Date: 2014-06-11 22:30
Hi, Stephan. Sorry, for some reason Yahoo was sending updates from the tracker to spam again, so I missed this. I'd be glad to sign a contributor agreement if it's still relevant, but it looks like there's a later patch that does what mine did and more?
msg220316 - (view) Author: Andrew Barnert (abarnert) * Date: 2014-06-11 22:32
Sorry, Stefan, not Stephan. Anyway, I've signed the agreement.
msg221902 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-06-29 21:13
Andrew, thanks for signing the agreement!


[Sunny]
> Is this what you expect?

I find the initialization in os.stat_result somewhat strange. Also,
a certain use of unnamed fields that worked in 2.5 is now broken,
which we should sort out before proceeding any further:


Apply structseq_repr_issue.diff, then:

>>> from _testcapi import mk_structseq
>>> s = mk_structseq("unnamed_end")
>>> s[3]
3
>>> tuple(s)
(0, 1, 2, 3, 4)
>>> s
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: In structseq_repr(), member 3 name is NULL for type _testcapi.struct_unnamed_end


Perhaps we should just add the missing field names, like namedtuple
does with rename=True:

(a=1, b=2, c=3, _4=4, _5=5)


In any case, in 2.5 the entire tuple was printed as the repr,
regardless of unnamed fields.
History
Date User Action Args
2016-10-03 16:19:08belopolskysetstage: needs patch -> patch review
versions: + Python 3.7, - Python 3.5
2015-07-21 07:57:10ethan.furmansetnosy: - ethan.furman
2014-10-14 16:42:44skrahsetnosy: - skrah
2014-06-29 21:13:18skrahsetfiles: + structseq_repr_issue.diff

messages: + msg221902
title: Enhance Object/structseq.c to match namedtuple and tuple api -> Enhance Object/structseq.c to match namedtuple and tuple api
2014-06-11 22:32:59abarnertsetmessages: + msg220316
2014-06-11 22:30:03abarnertsetmessages: + msg220315
2014-05-30 09:45:09rhettingersetversions: + Python 3.5, - Python 3.4
2014-05-30 09:38:44sunfinitesetmessages: + msg219381
2014-05-28 20:36:09skrahsetmessages: + msg219297
2014-05-26 10:17:35sunfinitesetfiles: + structseq_fields.patch

messages: + msg219150
2014-05-24 12:19:52skrahsetmessages: + msg219037
2014-04-29 11:58:18skrahlinkissue20230 superseder
2014-04-29 11:55:14skrahsetmessages: + msg217512
2014-04-29 05:23:13rhettingersetmessages: + msg217468
2014-04-28 11:50:52skrahsetmessages: + msg217365
2014-04-28 02:17:25rhettingersetmessages: + msg217337
2014-04-27 18:14:46skrahsetmessages: + msg217301
2014-04-27 11:13:30skrahsetnosy: + skrah
2014-01-13 01:32:52ethan.furmansetnosy: + ethan.furman
2014-01-13 01:31:59abarnertsetnosy: + abarnert
messages: + msg207994
2013-11-07 12:49:00sunfinitesetmessages: + msg202333
2013-11-07 12:40:50sunfinitesetfiles: + structseq_1.patch

nosy: + sunfinite
messages: + msg202330

keywords: + patch
2013-09-25 14:21:15Arfreversetnosy: + Arfrever
2013-09-24 21:54:42vstinnersetnosy: + vstinner
2013-09-22 16:23:27belopolskysetmessages: + msg198291
2013-07-11 07:29:11rhettingersetpriority: normal -> low
versions: + Python 3.4, - Python 3.3
messages: + msg192849

assignee: rhettinger ->
components: + Extension Modules, - Interpreter Core
stage: needs patch
2013-06-25 18:09:02eric.snowsetmessages: + msg191872
2012-06-25 01:05:44eric.snowsetnosy: + eric.snow
2011-07-13 15:42:21eric.araujosetnosy: + eric.araujo
2011-04-20 22:17:48rhettingerunlinkissue7796 dependencies
2011-04-16 20:00:37santoso.wijayasetversions: + Python 3.3, - Python 3.2
2011-04-09 01:02:30rhettingersetmessages: + msg133367
2011-04-09 00:53:03belopolskysetnosy: + belopolsky
messages: + msg133364
2011-04-08 23:23:55rhettingersetassignee: jackdied -> rhettinger
messages: + msg133355
2011-04-08 23:02:42amaury.forgeotdarclinkissue5907 dependencies
2011-04-08 23:02:13amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg133351
2011-04-08 19:51:33santoso.wijayasetnosy: + santoso.wijaya
2010-11-28 05:11:16eric.araujolinkissue7796 dependencies
2010-08-24 19:07:54gjb1002setnosy: + gjb1002
2010-08-09 19:02:00rhettingersetmessages: + msg113456
2010-08-09 18:55:12eric.smithsetmessages: + msg113453
2010-08-09 18:50:48terry.reedysetnosy: + terry.reedy

messages: + msg113451
versions: - Python 3.1, Python 2.7
2010-07-07 21:31:36benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg109503
2010-06-02 22:50:09giampaolo.rodolasetnosy: + giampaolo.rodola
2010-04-29 18:34:17rhettingersetpriority: low -> normal

messages: + msg104562
2010-04-29 18:21:47eric.smithsetmessages: + msg104559
2010-04-29 17:27:09eric.smithsetnosy: + eric.smith
2009-07-26 18:03:04salty-horsesetmessages: + msg90953
2009-07-26 10:32:20salty-horsesetnosy: + salty-horse
2009-07-13 22:01:06georg.brandlsetmessages: + msg90506
2009-07-13 08:56:27lemburgsetnosy: + lemburg
title: Enhance Object/structseq.c to match namedtuple and tuple api -> Enhance Object/structseq.c to match namedtuple and tuple api
messages: + msg90472
2009-07-12 22:48:16rhettingersetmessages: + msg90461
versions: + Python 3.2
2009-04-01 21:16:27rhettingersetassignee: rhettinger -> jackdied

messages: + msg85097
nosy: + jackdied
2009-02-11 04:46:19rhettingersetmessages: + msg81627
2009-02-11 04:39:11ajaksu2linkissue2308 superseder
2009-01-02 21:49:34rhettingersetversions: + Python 3.1, Python 2.7, - Python 2.6, Python 3.0
2008-02-23 22:54:01rhettingersetmessages: + msg62831
2008-02-23 22:49:45georg.brandlsetnosy: + georg.brandl
messages: + msg62830
2008-01-15 03:03:57rhettingersetmessages: + msg59957
2008-01-15 02:05:38rhettingersetmessages: + msg59956
2008-01-15 02:01:05adlaiff6setmessages: + msg59955
2008-01-15 01:21:34rhettingersetfiles: + structseq.diff
messages: + msg59949
2008-01-15 00:33:03rhettingersetassignee: rhettinger
2008-01-14 07:03:20adlaiff6setfiles: + structseq_subclasses_tuple.diff
nosy: + adlaiff6
messages: + msg59891
2008-01-14 04:20:27christian.heimescreate