classification
Title: filter() doesn't use __len__ of str/unicode/tuple subclasses
Type: Stage:
Components: Interpreter Core Versions: Python 2.5
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: collinwinter, rhettinger (2)
Priority: normal Keywords:

Created on 2006-07-13 19:25 by collinwinter, last changed 2006-07-13 22:43 by rhettinger.

Files
File name Uploaded Description
filter_use_subclasses_len.patch collinwinter, 2006-07-13 19:25 Against r50623
Messages (2)
msg29173 - Author: Collin Winter (collinwinter) Date: 2006-07-13 19:25
Consider the following code:

>>> class badstr(str):
...     def __len__(self):
...         return 6
...     def __getitem__(self, index):
...         return "a"
...
>>> filter(None, badstr("bbb"))
'aaa'

I would have expected the answer to be 'aaaaaa'.

The cause for this is that
Python/bltinmodule.c:filter{string,unicode,tuple} all
call PyString_Size (or the appropriate equivalent),
even if the sequence is a subclass of one of these
types. This bypasses any overloading of __len__ done by
the subclass, as demonstrated above.

If filter() is going to respect the subclass's
__getitem__ overload, it should also respect
overloading of __len__. The attached patch corrects this.
msg29174 - Author: Raymond Hettinger (rhettinger) Date: 2006-07-13 22:43
This isn't a bug.  You're making assumptions about
undocumented implementation details.  There is no shortage of
cases where the C code makes optimized direct internal calls
that bypass method overrides on subclasses of builtin types.
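
One commonly cited instance of this pattern in CPython, shown only as an illustration and not taken from this report: dict.update() stores items through the internal C routine and does not call a subclass's __setitem__. The class name DoublingDict is made up for the example.

```python
class DoublingDict(dict):
    def __setitem__(self, key, value):
        # Override that doubles every value stored through d[key] = value.
        dict.__setitem__(self, key, value * 2)

d = DoublingDict()
d["a"] = 1       # goes through the override, stores 2
d.update(b=1)    # CPython's dict.update bypasses the override, stores 1

print(sorted(d.items()))  # [('a', 2), ('b', 1)]
```

Here the override takes effect for item assignment but not for update(), which is the same kind of direct internal call as the length lookup in filter().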
History
Date User Action Args
2006-07-13 19:25:19 collinwinter create