classification
Title: list(obj), tuple(obj) swallow TypeError (in _PyObject_LengthHint)
Type: enhancement Stage: test needed
Components: Interpreter Core Versions: Python 3.3
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Elvis.Pranskevichus, amaury.forgeotdarc, benjamin.peterson, rhettinger, stutzbach, terry.reedy
Priority: low Keywords:

Created on 2011-03-25 20:38 by Elvis.Pranskevichus, last changed 2011-04-01 23:29 by Elvis.Pranskevichus. This issue is now closed.

Messages (11)
msg132150 - (view) Author: Elvis Pranskevichus (Elvis.Pranskevichus) * (Python triager) Date: 2011-03-25 20:38
Consider the following:

>>> class Test:
...     def __init__(self):
...         self.items = []
...     def __len__(self):
...         if not self.items:
...             self.items = list(self.calc_items())
...         return len(self.items)
...     def __iter__(self):
...         return iter(self.items)
...     def calc_items(self, number):
...         return range(1, number)
... 
>>> l = list(Test())
>>> print(l)
[]
>>> t = tuple(Test())
>>> print(t)
()


In the above example, the calc_items() method is called without its required argument, which raises TypeError.  That TypeError is wrongly interpreted as "object of type 'Test' has no len()" and is swallowed by _PyObject_LengthHint().

The result is unpredictable because the bug is masked, which is especially annoying for objects that can have a complex call graph under __len__().  A possible solution would be to adjust _PyObject_LengthHint() to rely on some exception other than plain TypeError.  Swallowing a generic exception like that is bad.
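
A minimal sketch of the difference between the explicit and the implicit call, reusing the Test class from the report. It assumes operator.length_hint(), which only entered the standard library in Python 3.4 (after this report) and is not the private C function named above; it is used here solely because it applies the same "TypeError means no length" fallback from Python code. The helper explicit_len_error() is a hypothetical name introduced for illustration.

```python
import operator


class Test:
    def __init__(self):
        self.items = []

    def __len__(self):
        if not self.items:
            # Bug from the report: calc_items() needs an argument,
            # so this call raises TypeError.
            self.items = list(self.calc_items())
        return len(self.items)

    def __iter__(self):
        return iter(self.items)

    def calc_items(self, number):
        return range(1, number)


def explicit_len_error(obj):
    """Return the TypeError raised by an explicit len() call, or None."""
    try:
        len(obj)
    except TypeError as exc:
        return exc
    return None


# Explicit call: the real error (the missing argument) surfaces.
error = explicit_len_error(Test())

# Implicit hint: the same TypeError is read as "no length available",
# so the caller-supplied default comes back instead of an exception.
hint = operator.length_hint(Test(), 5)
```

The explicit path keeps the original message about calc_items(), while the hint path gives no sign that anything went wrong.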
msg132152 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-25 20:56
_PyObject_LengthHint() does not define any exception; it simply calls PyObject_Size(), which is expected to raise TypeError for objects with no size (an opened file for example).

Where is the problem exactly? Does the failing __len__ put the object in an invalid state?
msg132157 - (view) Author: Elvis Pranskevichus (Elvis.Pranskevichus) * (Python triager) Date: 2011-03-25 21:21
The problem is that the call to __len__ is implicit.  The exception is simply swallowed by the list/tuple constructor.

The main issue is that the original error is masked which makes it very hard to pinpoint the actual cause of the failure.  

Here's an example of how it would fail if Test had a rather naive implementation of __getitem__():

>>> class Test:
...     def __init__(self):
...         self.items = []
...     def __len__(self):
...         if not self.items:
...             self.items = list(self.calc_items())
...         return len(self.items)
...     def __iter__(self):
...         return iter(self.items)
...     def calc_items(self, number):
...         return range(1, number)
...     def __getitem__(self, m):
...         return list(self)[m]
... 
>>> t = Test()
>>> print(t[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 13, in __getitem__
IndexError: list index out of range


It's not obvious in a trivial synthetic example like that, but in a complex case like an ORM abstraction this can cause a lot of pain.

Again, I'm not talking about the __len__ and TypeError protocol; the problem is the implicit exception swallowing by what is essentially an optional optimization hint.  Calling len(Test()) directly would bring up the correct trace just fine.
msg132169 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-03-25 22:23
A certain amount of exception masking is inherent in Python's design.  We use TypeError for a lot of things, including the exception raised by "len(obj)" when obj doesn't have length.

It may be possible to replace the TypeError check with a test to see if __len__ is defined, but that would subtly change the semantics.  For example, a class might define __len__ to raise a TypeError to indicate that the length is unknown (I've seen code like this in the wild being used to distinguish between finite inputs and potentially infinite inputs).
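
A rough illustration of the pattern described here, under stated assumptions: Naturals and is_sized() are hypothetical names, and operator.length_hint() (Python 3.4+, added after this thread) stands in for the internal length-hint lookup. The class defines __len__ only to raise TypeError on purpose, so a "is __len__ defined?" test would treat it differently than the current TypeError check does.

```python
import itertools
import operator


class Naturals:
    """Hypothetical endless stream 0, 1, 2, ... with a deliberate TypeError."""

    def __iter__(self):
        return itertools.count()

    def __len__(self):
        # Deliberate signal: the length is unknown (potentially infinite).
        raise TypeError("Naturals is unbounded and has no length")


def is_sized(obj):
    """Return True if len(obj) works, False if it raises TypeError."""
    try:
        len(obj)
    except TypeError:
        return False
    return True


# The deliberate TypeError is read as "no length", so the default is used.
hint = operator.length_hint(Naturals(), 10)

# Consumers can still take a bounded prefix of the stream.
first_three = list(itertools.islice(Naturals(), 3))
```

With the current check the deliberate TypeError and an accidental one inside __len__ look identical, which is the ambiguity under discussion.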

I don't really like the current design of __length_hint__, but it has been around for years and is somewhat set in stone.  If something does get changed, it should only be in Py3.3 so we don't break code that relies on the current behaviors.
msg132175 - (view) Author: Elvis Pranskevichus (Elvis.Pranskevichus) * (Python triager) Date: 2011-03-25 22:45
I think that explicitly raising TypeError in __len__() is wrong in any case.  I would argue something like NotImplementedError is a better choice to signal the absence of length.  Which also makes me think that Python's exceptions are, maybe, too coarse sometimes.
msg132179 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-03-25 23:03
> Which also makes me think that Python's exceptions are,
> maybe, too coarse sometimes.

I agree :-)
msg132773 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-04-01 22:39
Elvis, I agree that the masking is not nice. To call it a tracker bug (as opposed to a design bug), you need to show that the behavior is different from what is documented. Of course, this issue illustrates why one should have unit tests that try to test each component as directly as possible.

>A certain amount of exception masking is inherent in Python's design.

This issue comes up in other contexts, such as attribute access.

>We use TypeError for a lot of things, including the exception raised by "len(obj)" when obj doesn't have length.

Using user-level Exceptions internally is convenient, but possibly a small design flaw. Leaving code breakage aside, would it be possible to define private, undocumented inaccessible-from-Python-code internal subclasses such as _TypeError that never show up in Python level tracebacks (absent extension errors)?
msg132774 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-04-01 22:49
I have a proposal: only call __length_hint__ on C types.
msg132776 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-01 23:06
Am going to close this one because I don't see any straightforward way around it and because it's technically not a bug (just an undesirable design artifact).  The use of TypeError for objects that don't define __len__ is deeply ingrained into Python (20 years).  If something inside a __len__ method raises a TypeError, it would be challenging to distinguish that from a missing __len__ method.

[terry]
It might be possible to create internal subclasses of TypeError, AttributeError and whatnot, but that is a deep change with unknown implications for users, for other python implementations, for performance, for third-party modules, etc.  It would need a PEP and is likely to get rejected on a cost-benefit basis -- the substantial API churn wouldn't be worth the microscopic benefit (i.e. most people never encounter this or care about it).

[benjamin]
The __length_hint__ protocol is a public API, so anyone can use it.  Also, the issue is broader than __length_hint__; it is really about distinguishing multiple possible meanings for a TypeError raised by a call to __len__.
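
For readers unfamiliar with the protocol, a small sketch of a pure-Python iterator that implements __length_hint__. Countdown is a hypothetical class, and operator.length_hint() is assumed to be available (it arrived in Python 3.4, after this thread).

```python
import operator


class Countdown:
    """Hypothetical iterator that knows how many items remain."""

    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        self.remaining -= 1
        return self.remaining + 1

    def __length_hint__(self):
        # Only an estimate for preallocation; consumers must not rely on it.
        return self.remaining


counter = Countdown(3)
hint_before = operator.length_hint(counter)
values = list(counter)
hint_after = operator.length_hint(counter)
```

The hint only helps a consumer such as list() size its buffer; the values still come from iteration.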
msg132777 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-04-01 23:10
2011/4/1 Raymond Hettinger <report@bugs.python.org>:
> [benjamin]
> The __length_hint__ protocol is a public API, so anyone can use it.  Also, the issue is broader than __length_hint__; it is really about distinguishing multiple possible meanings for a TypeError raised by a call to __len__.

What?? I certainly hope not. I thought it was supposed to be a performance hack.
msg132778 - (view) Author: Elvis Pranskevichus (Elvis.Pranskevichus) * (Python triager) Date: 2011-04-01 23:29
I guess the best workaround would then be to use a decorator (or a metaclass) and wrap __len__ so that TypeError is caught and wrapped into some other exception.
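
One possible shape for such a decorator, as a sketch rather than a recommended recipe. The names LenError, guard_len and error_from are hypothetical; the Test class is the one from the original report.

```python
import functools


class LenError(RuntimeError):
    """Hypothetical wrapper for a TypeError raised inside __len__."""


def guard_len(func):
    """Re-raise a TypeError from __len__ as LenError so it is not swallowed."""
    @functools.wraps(func)
    def wrapper(self):
        try:
            return func(self)
        except TypeError as exc:
            raise LenError(
                f"__len__ of {type(self).__name__} failed"
            ) from exc
    return wrapper


class Test:
    def __init__(self):
        self.items = []

    @guard_len
    def __len__(self):
        if not self.items:
            self.items = list(self.calc_items())  # bug: missing argument
        return len(self.items)

    def __iter__(self):
        return iter(self.items)

    def calc_items(self, number):
        return range(1, number)


def error_from(call, *args):
    """Return the exception raised by call(*args), or None."""
    try:
        call(*args)
    except Exception as exc:
        return exc
    return None


len_error = error_from(len, Test())    # explicit len()
list_error = error_from(list, Test())  # implicit length hint in list()
```

Because LenError is not a TypeError, the list constructor no longer treats it as "no length", and the original TypeError stays reachable through __cause__. The trade-off is that a __len__ wrapped this way can no longer use TypeError to signal an unknown length.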

On April 1, 2011 07:10:21 PM Benjamin Peterson wrote:
> What?? I certainly hope not. I thought it was supposed to be a performance
> hack.

__length_hint__ does indeed look like a hack and not a well-defined API.  It is also undocumented.
History
Date                 User                 Action  Args
2011-04-01 23:29:12  Elvis.Pranskevichus  set     messages: + msg132778
2011-04-01 23:10:21  benjamin.peterson    set     messages: + msg132777
2011-04-01 23:06:36  rhettinger           set     status: open -> closed; resolution: not a bug; messages: + msg132776
2011-04-01 22:49:36  benjamin.peterson    set     nosy: + benjamin.peterson; messages: + msg132774
2011-04-01 22:39:18  terry.reedy          set     nosy: + terry.reedy; messages: + msg132773; type: behavior -> enhancement; stage: test needed
2011-03-25 23:03:53  rhettinger           set     assignee: rhettinger; messages: + msg132179
2011-03-25 22:47:41  Elvis.Pranskevichus  set     versions: + Python 3.3
2011-03-25 22:47:15  Elvis.Pranskevichus  set     versions: - Python 2.6, Python 2.5, Python 3.1, Python 2.7, Python 3.2, Python 3.3
2011-03-25 22:45:10  Elvis.Pranskevichus  set     messages: + msg132175; versions: + Python 2.6, Python 2.5, Python 3.1, Python 2.7, Python 3.2
2011-03-25 22:23:39  rhettinger           set     priority: normal -> low; versions: - Python 2.6, Python 2.5, Python 3.1, Python 2.7, Python 3.2
2011-03-25 22:23:26  rhettinger           set     messages: + msg132169
2011-03-25 22:05:57  stutzbach            set     nosy: + stutzbach
2011-03-25 21:25:06  pitrou               set     nosy: + rhettinger
2011-03-25 21:21:33  Elvis.Pranskevichus  set     messages: + msg132157
2011-03-25 20:56:26  amaury.forgeotdarc   set     nosy: + amaury.forgeotdarc; messages: + msg132152
2011-03-25 20:38:08  Elvis.Pranskevichus  create