Issue 27000: improve document of filter


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/71187

classification

Title:	improve document of filter
Type:		Stage:	resolved
Components:	Documentation	Versions:	Python 3.6, Python 3.5

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:	docs@python	Nosy List:	docs@python, josh.r, leewz, r.david.murray, rhettinger, xiang.zhang
Priority:	normal	Keywords:	patch

Created on 2016-05-11 16:45 by xiang.zhang, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name	Uploaded	Description
filter_doc.patch	xiang.zhang, 2016-05-11 16:45	

Messages (11)
msg265325 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-05-11 16:45
I think filter's doc can be improved [1]:
1. It doesn't mention ``bool``. ``bool`` is treated the same way as ``None``. It is not called. But this is not mentioned.
2. 'the identity function is assumed' is confusing, at least for me. It looks like when ``None`` is passed, function is set to a default func, lambda x: x. Then function is called and we identify the return value True or False. But this is not the truth. There is no default value and no function is applied. I think this should be deleted.
[1] https://docs.python.org/3/library/functions.html#filter
msg265327 - (view)	Author: Franklin? Lee (leewz)	Date: 2016-05-11 17:08
Aren't these both implementation details? As in, they only affect efficiency, not effect, right?
msg265328 - (view)	Author: Josh Rosenberg (josh.r) *	Date: 2016-05-11 17:08
bool is not enough of a special case to call it out without confusing the issue. No, the bool constructor is not actually called. But it will still behave as if it were called for all intents and purposes; it just skips the reference counting shenanigans for the actual True/False singleton objects. Drawing a distinction might make people worry that it wouldn't invoke __len__ or __bool__ as normal. Similarly, for all intents and purposes, your mental model of the identity function is mostly correct (I suspect the wording meant to use "function" in the mathematical sense, but it works either way). Yes, it never actually calls a function, but that's irrelevant to observed behavior. Your only mistake is in assuming the function actually returns the specific values True or False; no filter function needs to return True or False, they simply evaluate for truth or falsehood (that's why filter's docs use "true" and "false" to describe it, not "True" and "False"). filter(str.strip, list_of_strings) is perfectly legal, and yields those strings that contain non-whitespace characters.
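A minimal sketch of the behavior described in this message. The sample data is made up for illustration; only ``filter``, ``bool``, and ``str.strip`` are real names. It shows that a filter function only needs to return something that evaluates as true or false, not the ``True``/``False`` singletons.

```python
# Sample data is hypothetical; it mixes empty, whitespace-only,
# and non-blank strings.
list_of_strings = ["spam", "   ", "", " eggs ", "\t\n"]

# filter(None, ...) keeps every item that is true in a boolean context.
# For strings that means "non-empty", so whitespace-only strings survive.
kept_by_none = list(filter(None, list_of_strings))

# filter(bool, ...) has the same observable result as filter(None, ...).
kept_by_bool = list(filter(bool, list_of_strings))

# str.strip returns a string, never True or False. filter() keeps an item
# when that returned string is non-empty, i.e. when the item contains at
# least one non-whitespace character. The original items are yielded,
# not the stripped ones.
kept_by_strip = list(filter(str.strip, list_of_strings))

print(kept_by_none)
print(kept_by_bool)
print(kept_by_strip)
```

Note that ``kept_by_strip`` contains ``" eggs "`` with its surrounding spaces intact, because ``filter`` yields the input item rather than the function's return value.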
msg265329 - (view)	Author: Josh Rosenberg (josh.r) *	Date: 2016-05-11 17:10
Franklin said it better: The only difference between documentation and behavior is invisible implementation details, which have no business being documented in any event (since they needlessly tie the hands of maintainers of CPython and other Python interpreters, while providing no useful benefit).
msg265331 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-05-11 17:33
First I have to clarify that my mistake is not in understanding but in writing. What I mean by 'identify the return value True or False' is actually what you say, 'evaluate for truth or falsehood'. I also noticed the lowercase false and true in the doc. I know they are deliberate. Sorry about this. For ``bool``, I almost agree with you now. Although I still think it's telling readers incorrect info in the second part. For ``bool``, it is not equivalent to ``(item for item in iterable if function(item))`` but ``(item for item in iterable if item)``. For CPython, you are not telling the truth. As for the identity function, I stand by my point. I don't see any effect of this sentence other than confusion. I don't think removing it would affect other implementations either.
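The two generator expressions compared in this message can be run side by side. This is a minimal sketch with made-up sample data. It only compares results, so it says nothing about whether ``bool`` is called internally, which is the point under dispute.

```python
# Hypothetical sample data covering several falsy and truthy values.
iterable = [0, 1, "", "x", [], [0], None]

# The form given in the docs, with function bound to bool.
function = bool
with_call = list(item for item in iterable if function(item))

# The form the message argues is the accurate one for bool and None.
without_call = list(item for item in iterable if item)

# What the built-in actually produces for each argument.
via_filter_bool = list(filter(bool, iterable))
via_filter_none = list(filter(None, iterable))

print(with_call)
print(without_call)
print(via_filter_bool)
print(via_filter_none)
```

All four lists hold the same items, so any difference between the two forms is not visible in the output.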
msg265337 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2016-05-11 19:14
> bool is not enough of a special case to call it out without confusing the issue.
I concur. It would be easy to make the docs less usable by elaborating on this special case. For the most part, a user should use None if they just want to test the truth value of the input. The docs for filter() have been through a number of revisions and much discussion. Let's not undo previous efforts. For the most part, these docs have been successful in communicating what filter() does.
msg265345 - (view)	Author: Franklin? Lee (leewz)	Date: 2016-05-11 19:55
> Although I still think it's telling readers incorrect info in the second part. For ``bool``, it is not equivalent to ``(item for item in iterable if function(item))`` but ``(item for item in iterable if item)``. For CPython, you are not telling the truth.
What do you mean by, "it is not equivalent"? Are you saying that the first one will give a different result from the second? In general, when interpreting an object in a boolean context, Python will do the "equivalent" of calling ``bool`` on it, where "equivalent" in the docs means "has the same result as". See, for example, the ``itertools`` docs: https://docs.python.org/3/library/itertools.html#itertools.accumulate
--------
In this case: If ``filter`` is passed ``None`` or ``bool``, it will call "PyObject_IsTrue" on the object. (https://github.com/python/cpython/blob/c750281ef5d8fa89d13990792163605302e972d4/Python/bltinmodule.c#L481)
"PyObject_IsTrue" is defined here: https://github.com/python/cpython/blob/6aea3c26a22c5d7e3ffa3d725d8d75dac0e1b83b/Objects/object.c#L1223
On the other hand, ``bool`` is defined here, as "PyBool_Type": https://github.com/python/cpython/blob/c750281ef5d8fa89d13990792163605302e972d4/Python/bltinmodule.c#L2686
"PyBool_Type" is defined here, with the ``bool.__new__`` function defined as "bool_new": https://github.com/python/cpython/blob/2d264235f6e066611b412f7c2e1603866e0f7f1b/Objects/boolobject.c#L133
"bool_new" is defined here, using "PyObject_IsTrue": https://github.com/python/cpython/blob/2d264235f6e066611b412f7c2e1603866e0f7f1b/Objects/boolobject.c#L43
Both "filter_next" and "bool_new" call "PyObject_IsTrue" and take 0 as False, positive as True, and negative as an error. So it's equivalent to calling ``bool``, but the "bool_new" call is sort of inlined. Does that clear things up?
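The "same result" claim can be checked from Python by counting truth tests. This is a sketch only. ``Probe`` and ``run`` are hypothetical helpers, not part of CPython, and the call counts are what I would expect on CPython given the code paths linked above.

```python
class Probe:
    """Hypothetical wrapper that records each truth test made on it."""

    def __init__(self, value):
        self.value = value
        self.bool_calls = 0

    def __bool__(self):
        # Invoked whenever the object is evaluated for truth,
        # whether through bool(obj) or an implicit truth test.
        self.bool_calls += 1
        return bool(self.value)


def run(predicate):
    """Filter fresh probes with predicate; return kept values and call counts."""
    probes = [Probe(v) for v in (0, 1, "", "x")]
    kept = [p.value for p in filter(predicate, probes)]
    calls = [p.bool_calls for p in probes]
    return kept, calls


# None, bool, and an explicit identity function as the filter argument.
for name, predicate in [("None", None), ("bool", bool), ("identity", lambda x: x)]:
    kept, calls = run(predicate)
    print(name, kept, calls)
```

In each case every probe is truth-tested exactly once and the same items are kept, so user code that defines ``__bool__`` cannot tell the three variants apart.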
msg265350 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2016-05-12 00:07
Thanks for wanting to improve the docs, but the docs are a specification of syntax and behavior, not of implementation. So, the existing docs are correct, and changing them would over-specify the function. Since Raymond has also voted for rejection, I'm closing this.
msg265356 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-05-12 02:07
It's OK. Thanks for all your info; I did learn from it. BTW, Franklin, I knew what would happen when ``bool`` is passed.
msg265357 - (view)	Author: Franklin? Lee (leewz)	Date: 2016-05-12 02:10
In that case, I'm still wondering what you mean by "not equivalent". Are you saying there is code which will work only if the ``bool`` function is really called?
msg265358 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-05-12 02:24
Not about code, just the doc. In my opinion, if ``bool`` is not called it is definitely not equivalent to ``(item for item in iterable if function(item))``, which actually calls the function, even if there is no difference in the result. But this is rather subjective and not important now. I am OK with all your opinions. And considering other interpreters, leaving it untouched is a good idea.

History
Date	User	Action	Args
2022-04-11 14:58:30	admin	set	github: 71187
2016-05-12 02:24:00	xiang.zhang	set	messages: + msg265358
2016-05-12 02:10:52	leewz	set	messages: + msg265357
2016-05-12 02:07:53	xiang.zhang	set	messages: + msg265356
2016-05-12 00:07:12	r.david.murray	set	status: open -> closed nosy: + r.david.murray messages: + msg265350 resolution: not a bug stage: resolved
2016-05-11 19:55:27	leewz	set	messages: + msg265345
2016-05-11 19:14:19	rhettinger	set	nosy: + rhettinger messages: + msg265337
2016-05-11 17:33:10	xiang.zhang	set	messages: + msg265331
2016-05-11 17:10:26	josh.r	set	messages: + msg265329
2016-05-11 17:08:31	josh.r	set	nosy: + josh.r messages: + msg265328
2016-05-11 17:08:05	leewz	set	nosy: + leewz messages: + msg265327
2016-05-11 16:45:55	xiang.zhang	create