Issue 30413: Add fnmatch.filterfalse function

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/74598

classification

Title:	Add fnmatch.filterfalse function
Type:	enhancement	Stage:	patch review
Components:	Library (Lib)	Versions:	Python 3.7

process

Status:	open	Resolution:
Dependencies:	30415	Superseder:
Assigned To:		Nosy List:	Roee Nizan, brett.cannon, rhettinger, serhiy.storchaka, steven.daprano, vstinner, wolma
Priority:	normal	Keywords:	patch

Created on 2017-05-20 13:19 by steven.daprano, last changed 2022-04-11 14:58 by admin.

Files
File name	Uploaded	Description	Edit
filterfalse.diff	steven.daprano, 2017-05-20 13:19		review
filterfalse.alternate_patch	wolma, 2017-05-20 17:08		review

Messages (17)
msg294029 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2017-05-20 13:19
There has been some discussion on Python-Ideas about adding fnmatch.filter_false to complement filter, for when you want to ignore matching files rather than non-matching ones. https://mail.python.org/pipermail/python-ideas/2017-May/045694.html I don't believe there have been any strong objections, and work-arounds have a considerable performance penalty (according to the OP, I have not confirmed that). Here's a patch which refactors filter and adds filter_false. Sorry I haven't got my head around the brave new world on Github, so I'm going old school here with a diff :-)
msg294031 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-05-20 14:31
In general LGTM. I left few minor comments on Rietveld. Needed the versionadded directive, Misc/NEWS and What's New entries. But shouldn't the new function have name "filterfalse" for parallel with itertools.filterfalse? And this would better match the style of the fnmatch module, fnmatchcase doesn't contain an underscore. Tests for fnmatch.filter are too poor. I think we should add more tests (this is other issue).
msg294036 - (view)	Author: Wolfgang Maier (wolma) *	Date: 2017-05-20 17:08
Hi, seems I had the same thoughts as you, Steven. I had started working on a patch independently yesterday, but after making my changes to fnmatch itself, I found I had too many other things to do to address unittests and doc changes to turn this into a real patch - so thank you for spending time on all of this. I just downloaded your patch and merged it with mine because I think my version of fnmatch.py may be simpler and slightly faster (though like you I didn't run performance tests). Feel free to do whatever you like with this alternate version - it's not that there is much new in it to take credit for :) Another thing I noted: fnmatch takes a rather unusual approach to determine whether normcase is NOP or not. It imports posixpath only to see if it is the same as os.path. That looks pretty weird and wasteful to me (especially if you are running this on Windows and are effectively importing posixpath just for fun then). I think it would be better to just check os.name instead (like pathlib does for example). I moved the corresponding check to the top of the module to make this easier to address, should that be of interest. I'm also using the result of the check in fnmatch.fnmatch because I don't see any reason why you wouldn't want to benefit from it there as well.
msg294038 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-05-20 17:26
Your patch makes filter() returning normalized names. This may be not what is expected.
msg294039 - (view)	Author: Wolfgang Maier (wolma) *	Date: 2017-05-20 17:31
Does it? I thought it does so only if normalize_case is True. Did I miss something?
msg294041 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-05-20 17:34
Yes, it does so only if normalize_case is True. Currently fnmatch.filter() returns non-normalized names.
msg294078 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2017-05-21 06:39
I'm happy for you to change the name to filterfalse.
msg294110 - (view)	Author: Wolfgang Maier (wolma) *	Date: 2017-05-21 19:31
@serhiy: my bad! I just hadn't realized this behavior of the original. With this requirement I cannot see any simpler solution than Steven's. Some other questions though to everyone involved: 1) what do you think about "os.path is posixpath" vs just checking os.name == 'posix' as I suggested earlier? 2) speaking of missing functionality in filter: What do you think of a keyword argument like 'case' to both filter and filterfalse that, when True, would make these functions behave equivalently to [n for n in names if fnmatchcase(n, pattern)] The default would be False, of course. I know this would be inconsistent in terms of argument vs separate functions, but it is easy to explain and learn and without it Windows users of filter/filterfalse would really suffer from the much poorer performance due to the slow normcase call (even slower now with the new fspath protocol) even if they pass in normalized names already. 3) not directly related to this issue, but I came across it in this context: isn't normcase in both posixpath and ntpath doing isinstance(str, bytes) checks that are redundant with os.fspath? Is this oversight or am I missing something again?
msg294111 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-05-21 20:39
> 1) what do you think about "os.path is posixpath" vs just checking os.name == 'posix' as I suggested earlier? Currently os.path is posixpath if and only if os.name == 'posix'. But this can be changed in future. I think it is safer to the current check. The posixpath module doesn't contain classes, enums or namedtuples, it's import is fast. And it would be safer to keep the check in the function rather than move it at module level. > What do you think of a keyword argument like 'case' to both filter and filterfalse that, when True, would make these functions behave equivalently to > [n for n in names if fnmatchcase(n, pattern)] This is a different issue. fnmatch.filter() was added for speeding up glob.glob() (see issue409973). I don't know whether there is a good use case for filtercase(). > isn't normcase in both posixpath and ntpath doing isinstance(str, bytes) checks that are redundant with os.fspath? Good catch! It seems to me that they are redundant. Please open a new issue for this.
msg294132 - (view)	Author: Wolfgang Maier (wolma) *	Date: 2017-05-22 08:32
> Good catch! It seems to me that they are redundant. Please open a new issue for this. done: issue 30427
msg294133 - (view)	Author: Wolfgang Maier (wolma) *	Date: 2017-05-22 08:41
Yet another thing I just realized (sorry for being so annoying): With os.normcase calling os.fspath in 3.6+ it is not really a NOP anymore even on posix. As a consequence, you can now do some weird things with fnmatch: in all cases, and only in these, where the module does call normcase you can pass in Path-like objects as the names. This works with fnmatch.fnmatch on all platforms, never for fnmatchcase, and platform-dependently (on Windows, but not on posix platforms) for filter/filterfalse. It's mostly that last case that worries me because it makes it easy to accidentally write code that is not platform-independent. Also note that you can also pass a Path-like object as pat everywhere except in fnmatchcase.
msg294510 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2017-05-25 19:33
I don't think os.normcase() calling os.fspath() needs to be a worry. Pragmatically it's just getting a str/bytes from an object that knows how to return a str/bytes for a path. IOW it's no different than os.normcase(str(path)) and so embedding the call such that it's not a flat-out no-op shouldn't be a general concern.
msg350056 - (view)	Author: Roee Nizan (Roee Nizan)	Date: 2019-08-21 09:56
Was this abandoned? Is there a reason not to accept this?
msg350062 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-08-21 10:51
Rather than adding a new function, why not adding a parameter like sort(key=func, reverse=True)? Something like fnmatch.filterfalse(pat, invert=True)? The Unix grep command has a --invert-match option: -v, --invert-match Invert the sense of matching, to select non-matching lines. The Unix find and test commands use "!" to invert a condition. Example: find ! -name "*.py" # list files which doesn't match ".py" test ! -e setup.py # true if setup.py doesn't exist
msg350066 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2019-08-21 11:10
On Wed, Aug 21, 2019 at 10:51:02AM +0000, STINNER Victor wrote: > Rather than adding a new function, why not adding a parameter like > sort(key=func, reverse=True)? Something like fnmatch.filterfalse(pat, > invert=True)? Guido argues that as a general rule of thumb, we should avoid "constant bool parameters" and prefer seperate functions. For example, we have - re.search and re.match, not re.search(start=True) - str.find and str.rfind, not str.find(end=False) If we typically call the function with a bool constant: filter(pat, invert=True) rather than a variable or expression filter(pat, invert=condition or default) that's a hint that the two cases probably should be seperate functions. Martin Fowler agrees: https://martinfowler.com/bliki/FlagArgument.html as does Raymond Chen: https://devblogs.microsoft.com/oldnewthing/20060828-18/?p=29953 (Of course there are cases where it is impractical to avoid bool flags -- if a function would otherwise take four flags, we would need sixteen functions! -- or there may be other reasons why we might go against that design rule.)
msg350090 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2019-08-21 16:01
> Rather than adding a new function, why not adding a parameter Guido, the reason iteratools has a separate filter() and filterfalse() tools is because of your API guidance to prefer separate functions rather than having flags in most cases. Do you still feel that way and does it apply here? Personally, I would want to have a separate function for fnmatch.filter_false() rather than adding complexity to the API for fnmatch.
msg350093 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2019-08-21 16:32
Yes, I still feel like this and I find it applies here. (Note that the module already has two variants of fnmatch(), so it's nothing new in this context.)

History
Date	User	Action	Args
2022-04-11 14:58:46	admin	set	github: 74598
2019-08-21 16:32:30	gvanrossum	set	nosy: - gvanrossum
2019-08-21 16:32:16	gvanrossum	set	messages: + msg350093
2019-08-21 16:01:18	rhettinger	set	nosy: + rhettinger, gvanrossum messages: + msg350090
2019-08-21 11:10:49	steven.daprano	set	messages: + msg350066
2019-08-21 10:51:02	vstinner	set	nosy: + vstinner messages: + msg350062
2019-08-21 09:56:34	Roee Nizan	set	nosy: + Roee Nizan messages: + msg350056
2017-05-25 19:33:10	brett.cannon	set	messages: + msg294510
2017-05-25 12:01:36	serhiy.storchaka	set	nosy: + brett.cannon
2017-05-22 08:41:37	wolma	set	messages: + msg294133
2017-05-22 08:32:30	wolma	set	messages: + msg294132
2017-05-21 20:39:22	serhiy.storchaka	set	messages: + msg294111 title: Add fnmatch.filter_false function -> Add fnmatch.filterfalse function
2017-05-21 19:31:54	wolma	set	messages: + msg294110
2017-05-21 06:39:16	steven.daprano	set	messages: + msg294078
2017-05-20 18:41:13	serhiy.storchaka	set	dependencies: + Improve fnmatch testing components: + Library (Lib)
2017-05-20 17:34:34	serhiy.storchaka	set	messages: + msg294041
2017-05-20 17:31:47	wolma	set	messages: + msg294039
2017-05-20 17:26:41	serhiy.storchaka	set	messages: + msg294038
2017-05-20 17:08:40	wolma	set	files: + filterfalse.alternate_patch nosy: + wolma messages: + msg294036
2017-05-20 14:31:54	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg294031 stage: patch review
2017-05-20 13:19:02	steven.daprano	create