classification
Title: improve document of filter
Type:
Stage: resolved
Components: Documentation
Versions: Python 3.6, Python 3.5
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: docs@python, josh.r, leewz, r.david.murray, rhettinger, xiang.zhang
Priority: normal
Keywords: patch

Created on 2016-05-11 16:45 by xiang.zhang, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
filter_doc.patch xiang.zhang, 2016-05-11 16:45 review
Messages (11)
msg265325 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-05-11 16:45
I think filter's doc can be improved[1]:

1. It doesn't mention ``bool``. ``bool`` is treated the same way as ``None``. It is not called. But this is not mentioned.
2. 'the identity function is assumed' is confusing, at least to me. It looks like when ``None`` is passed, *function* is set to a default function, ``lambda x: x``. Then *function* is called and we identify the return value True or False. But this is not the case. There is no default value and no function is applied. I think this sentence should be deleted.

[1] https://docs.python.org/3/library/functions.html#filter
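
For reference, a minimal sketch of the behavior under discussion. The sample list ``values`` is made up for illustration; the point is only that passing ``None`` or ``bool`` selects the same items as a plain truth test:

```python
# Hypothetical sample data mixing truthy and falsy objects.
values = [0, 1, "", "a", [], [0], None, 2.5]

with_none = list(filter(None, values))           # no function supplied
with_bool = list(filter(bool, values))           # bool supplied as the function
with_genexp = [item for item in values if item]  # plain truth test

print(with_none)  # [1, 'a', [0], 2.5]
print(with_none == with_bool == with_genexp)  # True
```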
msg265327 - (view) Author: Franklin? Lee (leewz) Date: 2016-05-11 17:08
Aren't these both implementation details? As in, they only affect efficiency, not effect, right?
msg265328 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-05-11 17:08
bool is not enough of a special case to call it out without confusing the issue. No, the bool constructor is not actually called. But it will still behave as if it were called for all intents and purposes; it just skips the reference counting shenanigans for the actual True/False singleton objects. Drawing a distinction might make people worry that it wouldn't invoke __len__ or __bool__ as normal.

Similarly, for all intents and purposes, your mental model of the identity function is mostly correct (I suspect the wording meant to use "function" in the mathematical sense, but it works either way). Yes, it never actually calls a function, but that's irrelevant to observed behavior. Your only mistake is in assuming the function actually returns the specific values True or False; no filter function needs to return True or False, they simply evaluate for truth or falsehood (that's why filter's docs use "true" and "false" to describe it, not "True" and "False"). filter(str.strip, list_of_strings) is perfectly legal, and yields those strings that contain non-whitespace characters.
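
A short sketch of the ``str.strip`` case mentioned above, using a made-up ``list_of_strings``. The function's return value is only tested for truth; the items yielded are the original, unstripped strings:

```python
# Hypothetical input: two strings with content, three that are empty or whitespace-only.
list_of_strings = ["spam", "   ", "", "  eggs ", "\t\n"]

# str.strip returns '' (false) for empty and whitespace-only strings,
# and a non-empty string (true) otherwise.
kept = list(filter(str.strip, list_of_strings))

print(kept)  # ['spam', '  eggs ']
```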
msg265329 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-05-11 17:10
Franklin said it better: The only difference between documentation and behavior is invisible implementation details, which have no business being documented in any event (since they needlessly tie the hands of maintainers of CPython and other Python interpreters, while providing no useful benefit).
msg265331 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-05-11 17:33
First I have to clarify that my mistake is not in understanding but in writing. What I mean by 'identify the return value True or False' is actually what you say, 'evaluate for truth or falsehood'. I also notice the lowercase false and true in the doc. I know they are deliberate. Sorry about this.

For ``bool``, I almost agree with you now. Although I still think it's telling readers incorrect info in the second part. For ``bool``, it is not equivalent to ``(item for item in iterable if function(item))`` but ``(item for item in iterable if item)``. For CPython, you are not telling the truth.

And for the identity function, I insist. I don't see that this sentence adds anything other than confusion. I don't think removing it will affect other implementations either.
msg265337 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-05-11 19:14
> bool is not enough of a special case to call it out without 
> confusing the issue.

I concur.  It would be easy to make the docs less usable by elaborating on this special case.  For the most part, a user should use None if they just want to test the truth value of the input.

The docs for filter() have been through a number of revisions and much discussion.  Let's not undo previous efforts.  For the most part, these docs have been successful in communicating what filter() does.
msg265345 - (view) Author: Franklin? Lee (leewz) Date: 2016-05-11 19:55
> Although I still think it's telling readers incorrect info in the second part. For ``bool``, it is not equivalent to ``(item for item in iterable if function(item))`` but ``(item for item in iterable if item)``. For CPython, you are not telling the truth.

What do you mean by, "it is not equivalent"? Are you saying that the first one will give a different result from the second? In general, when interpreting an object in a boolean context, Python will do the "equivalent" of calling ``bool`` on it, where "equivalent" in the docs means "has the same result as". See, for example, the ``itertools`` docs:
https://docs.python.org/3/library/itertools.html#itertools.accumulate
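
To illustrate "has the same result as", here is a small sketch. The class ``Flag`` and its call counter are hypothetical, introduced only for this example. It checks that an explicit ``bool(item)`` call, a bare ``if item`` test, and ``filter(None, ...)`` all consult ``__bool__`` once per item and select the same objects:

```python
class Flag:
    """Hypothetical class that counts how often its truth value is tested."""

    calls = 0

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        Flag.calls += 1
        return self.value


items = [Flag(True), Flag(False), Flag(True)]

# Explicit call to bool() in the condition.
explicit = [item for item in items if bool(item)]
calls_explicit, Flag.calls = Flag.calls, 0

# Implicit truth test in the condition.
implicit = [item for item in items if item]
calls_implicit, Flag.calls = Flag.calls, 0

# filter() with no function supplied.
via_filter = list(filter(None, items))
calls_filter, Flag.calls = Flag.calls, 0

print(explicit == implicit == via_filter)          # True
print(calls_explicit, calls_implicit, calls_filter)  # 3 3 3
```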

--------

In this case:

If ``filter`` is passed ``None`` or ``bool``, it will call "PyObject_IsTrue" on the object.
    (https://github.com/python/cpython/blob/c750281ef5d8fa89d13990792163605302e972d4/Python/bltinmodule.c#L481)

"PyObject_IsTrue" is defined here:
    https://github.com/python/cpython/blob/6aea3c26a22c5d7e3ffa3d725d8d75dac0e1b83b/Objects/object.c#L1223

On the other hand, ``bool`` is defined here, as "PyBool_Type":
    https://github.com/python/cpython/blob/c750281ef5d8fa89d13990792163605302e972d4/Python/bltinmodule.c#L2686


"PyBool_Type" is defined here, with the ``bool.__new__`` function defined as "bool_new":
    https://github.com/python/cpython/blob/2d264235f6e066611b412f7c2e1603866e0f7f1b/Objects/boolobject.c#L133

"bool_new" is defined here, using "PyObject_IsTrue":
    https://github.com/python/cpython/blob/2d264235f6e066611b412f7c2e1603866e0f7f1b/Objects/boolobject.c#L43

Both "filter_next" and "bool_new" call "PyObject_IsTrue" and take 0 as False, positive as True, and negative as an error. So it's equivalent to calling ``bool``, but the "bool_new" call is sort of inlined.
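
The call chain traced above can be modeled roughly in Python. This is only a sketch of the described behavior, not CPython's actual code: ``filter_model`` is a hypothetical name, and the ``True if ... else False`` expressions stand in for "PyObject_IsTrue":

```python
def filter_model(function, iterable):
    """Rough pure-Python model of the behavior described above."""
    for item in iterable:
        if function is None or function is bool:
            # Truth-test the item directly, without calling anything.
            keep = True if item else False
        else:
            # Call the function, then truth-test its result.
            keep = True if function(item) else False
        if keep:
            yield item


# Hypothetical sample data for comparing the model with the builtin.
data = [0, 1, "", "x", [], [0]]

print(list(filter_model(None, data)) == list(filter(None, data)))  # True
print(list(filter_model(bool, data)) == list(filter(bool, data)))  # True
print(list(filter_model(str, data)) == list(filter(str, data)))    # True
```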

Does that clear things up?
msg265350 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-12 00:07
Thanks for wanting to improve the docs, but the docs are a specification of syntax and behavior, not of implementation.  So, the existing docs are correct, and changing them would over-specify the function.  Since Raymond has also voted for rejection, I'm closing this.
msg265356 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-05-12 02:07
It's OK. Thanks for all your info; I did learn from it. BTW, Franklin, I knew what would happen when ``bool`` is passed.
msg265357 - (view) Author: Franklin? Lee (leewz) Date: 2016-05-12 02:10
In that case, I'm still wondering what you mean by "not equivalent". Are you saying there is code which will work only if the ``bool`` function is really called?
msg265358 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-05-12 02:24
Not about code, just the doc. In my opinion, if ``bool`` is not called it is definitely not equivalent to ``(item for item in iterable if function(item))``, which actually calls the function, even if there is no difference in the result. But this is rather subjective and not important now. I am OK with all your opinions. And considering other interpreters, leaving it untouched is a good idea.
History
Date User Action Args
2022-04-11 14:58:30 admin set github: 71187
2016-05-12 02:24:00 xiang.zhang set messages: + msg265358
2016-05-12 02:10:52 leewz set messages: + msg265357
2016-05-12 02:07:53 xiang.zhang set messages: + msg265356
2016-05-12 00:07:12 r.david.murray set status: open -> closed; nosy: + r.david.murray; messages: + msg265350; resolution: not a bug; stage: resolved
2016-05-11 19:55:27 leewz set messages: + msg265345
2016-05-11 19:14:19 rhettinger set nosy: + rhettinger; messages: + msg265337
2016-05-11 17:33:10 xiang.zhang set messages: + msg265331
2016-05-11 17:10:26 josh.r set messages: + msg265329
2016-05-11 17:08:31 josh.r set nosy: + josh.r; messages: + msg265328
2016-05-11 17:08:05 leewz set nosy: + leewz; messages: + msg265327
2016-05-11 16:45:55 xiang.zhang create