msg294029 - (view) |
Author: Steven D'Aprano (steven.daprano) * |
Date: 2017-05-20 13:19 |
There has been some discussion on Python-Ideas about adding fnmatch.filter_false to complement filter, for when you want to ignore matching files rather than non-matching ones.
https://mail.python.org/pipermail/python-ideas/2017-May/045694.html
I don't believe there have been any strong objections, and work-arounds have a considerable performance penalty (according to the OP, I have not confirmed that).
Here's a patch which refactors filter and adds filter_false.
Sorry I haven't got my head around the brave new world on Github, so I'm going old school here with a diff :-)
|
msg294031 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-05-20 14:31 |
In general LGTM. I left few minor comments on Rietveld. Needed the versionadded directive, Misc/NEWS and What's New entries.
But shouldn't the new function have name "filterfalse" for parallel with itertools.filterfalse? And this would better match the style of the fnmatch module, fnmatchcase doesn't contain an underscore.
Tests for fnmatch.filter are too poor. I think we should add more tests (this is other issue).
|
msg294036 - (view) |
Author: Wolfgang Maier (wolma) * |
Date: 2017-05-20 17:08 |
Hi,
seems I had the same thoughts as you, Steven. I had started working on a patch independently yesterday, but after making my changes to fnmatch itself, I found I had too many other things to do to address unittests and doc changes to turn this into a real patch - so thank you for spending time on all of this.
I just downloaded your patch and merged it with mine because I think my version of fnmatch.py may be simpler and slightly faster (though like you I didn't run performance tests). Feel free to do whatever you like with this alternate version - it's not that there is much new in it to take credit for :)
Another thing I noted: fnmatch takes a rather unusual approach to determine whether normcase is NOP or not. It imports posixpath only to see if it is the same as os.path. That looks pretty weird and wasteful to me (especially if you are running this on Windows and are effectively importing posixpath just for fun then). I think it would be better to just check os.name instead (like pathlib does for example). I moved the corresponding check to the top of the module to make this easier to address, should that be of interest. I'm also using the result of the check in fnmatch.fnmatch because I don't see any reason why you wouldn't want to benefit from it there as well.
|
msg294038 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-05-20 17:26 |
Your patch makes filter() returning normalized names. This may be not what is expected.
|
msg294039 - (view) |
Author: Wolfgang Maier (wolma) * |
Date: 2017-05-20 17:31 |
Does it? I thought it does so only if normalize_case is True.
Did I miss something?
|
msg294041 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-05-20 17:34 |
Yes, it does so only if normalize_case is True. Currently fnmatch.filter() returns non-normalized names.
|
msg294078 - (view) |
Author: Steven D'Aprano (steven.daprano) * |
Date: 2017-05-21 06:39 |
I'm happy for you to change the name to filterfalse.
|
msg294110 - (view) |
Author: Wolfgang Maier (wolma) * |
Date: 2017-05-21 19:31 |
@serhiy: my bad! I just hadn't realized this behavior of the original.
With this requirement I cannot see any simpler solution than Steven's.
Some other questions though to everyone involved:
1) what do you think about "os.path is posixpath" vs just checking os.name == 'posix' as I suggested earlier?
2) speaking of missing functionality in filter:
What do you think of a keyword argument like 'case' to both filter and filterfalse that, when True, would make these functions behave equivalently to
[n for n in names if fnmatchcase(n, pattern)]
The default would be False, of course. I know this would be inconsistent in terms of argument vs separate functions, but it is easy to explain and learn and without it Windows users of filter/filterfalse would really suffer from the much poorer performance due to the slow normcase call (even slower now with the new fspath protocol) even if they pass in normalized names already.
3) not directly related to this issue, but I came across it in this context:
isn't normcase in both posixpath and ntpath doing isinstance(str, bytes) checks that are redundant with os.fspath? Is this oversight or am I missing something again?
|
msg294111 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-05-21 20:39 |
> 1) what do you think about "os.path is posixpath" vs just checking os.name == 'posix' as I suggested earlier?
Currently os.path is posixpath if and only if os.name == 'posix'. But this can be changed in future. I think it is safer to the current check. The posixpath module doesn't contain classes, enums or namedtuples, it's import is fast.
And it would be safer to keep the check in the function rather than move it at module level.
> What do you think of a keyword argument like 'case' to both filter and filterfalse that, when True, would make these functions behave equivalently to
> [n for n in names if fnmatchcase(n, pattern)]
This is a different issue. fnmatch.filter() was added for speeding up glob.glob() (see issue409973). I don't know whether there is a good use case for filtercase().
> isn't normcase in both posixpath and ntpath doing isinstance(str, bytes) checks that are redundant with os.fspath?
Good catch! It seems to me that they are redundant. Please open a new issue for this.
|
msg294132 - (view) |
Author: Wolfgang Maier (wolma) * |
Date: 2017-05-22 08:32 |
> Good catch! It seems to me that they are redundant. Please open a new issue for this.
done: issue 30427
|
msg294133 - (view) |
Author: Wolfgang Maier (wolma) * |
Date: 2017-05-22 08:41 |
Yet another thing I just realized (sorry for being so annoying):
With os.normcase calling os.fspath in 3.6+ it is not really a NOP anymore even on posix. As a consequence, you can now do some weird things with fnmatch: in all cases, and only in these, where the module *does* call normcase you can pass in Path-like objects as the names.
This works with fnmatch.fnmatch on all platforms, never for fnmatchcase, and platform-dependently (on Windows, but not on posix platforms) for filter/filterfalse.
It's mostly that last case that worries me because it makes it easy to accidentally write code that is not platform-independent.
Also note that you can also pass a Path-like object as pat everywhere except in fnmatchcase.
|
msg294510 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2017-05-25 19:33 |
I don't think os.normcase() calling os.fspath() needs to be a worry. Pragmatically it's just getting a str/bytes from an object that knows how to return a str/bytes for a path. IOW it's no different than os.normcase(str(path)) and so embedding the call such that it's not a flat-out no-op shouldn't be a general concern.
|
msg350056 - (view) |
Author: Roee Nizan (Roee Nizan) |
Date: 2019-08-21 09:56 |
Was this abandoned? Is there a reason not to accept this?
|
msg350062 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2019-08-21 10:51 |
Rather than adding a new function, why not adding a parameter like sort(key=func, reverse=True)? Something like fnmatch.filterfalse(pat, invert=True)?
The Unix grep command has a --invert-match option:
-v, --invert-match
Invert the sense of matching, to select non-matching lines.
The Unix find and test commands use "!" to invert a condition. Example:
find ! -name "*.py" # list files which doesn't match ".py"
test ! -e setup.py # true if setup.py doesn't exist
|
msg350066 - (view) |
Author: Steven D'Aprano (steven.daprano) * |
Date: 2019-08-21 11:10 |
On Wed, Aug 21, 2019 at 10:51:02AM +0000, STINNER Victor wrote:
> Rather than adding a new function, why not adding a parameter like
> sort(key=func, reverse=True)? Something like fnmatch.filterfalse(pat,
> invert=True)?
Guido argues that as a general rule of thumb, we should avoid "constant
bool parameters" and prefer seperate functions. For example, we have
- re.search and re.match, not re.search(start=True)
- str.find and str.rfind, not str.find(end=False)
If we typically call the function with a bool constant:
filter(pat, invert=True)
rather than a variable or expression
filter(pat, invert=condition or default)
that's a hint that the two cases probably should be seperate functions.
Martin Fowler agrees:
https://martinfowler.com/bliki/FlagArgument.html
as does Raymond Chen:
https://devblogs.microsoft.com/oldnewthing/20060828-18/?p=29953
(Of course there are cases where it is impractical to avoid bool flags
-- if a function would otherwise take four flags, we would need sixteen
functions! -- or there may be other reasons why we might go against that
design rule.)
|
msg350090 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2019-08-21 16:01 |
> Rather than adding a new function, why not adding a parameter
Guido, the reason iteratools has a separate filter() and filterfalse() tools is because of your API guidance to prefer separate functions rather than having flags in most cases.
Do you still feel that way and does it apply here? Personally, I would want to have a separate function for fnmatch.filter_false() rather than adding complexity to the API for fnmatch.
|
msg350093 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2019-08-21 16:32 |
Yes, I still feel like this and I find it applies here. (Note that the module already has two variants of fnmatch(), so it's nothing new in this context.)
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:46 | admin | set | github: 74598 |
2019-08-21 16:32:30 | gvanrossum | set | nosy:
- gvanrossum
|
2019-08-21 16:32:16 | gvanrossum | set | messages:
+ msg350093 |
2019-08-21 16:01:18 | rhettinger | set | nosy:
+ rhettinger, gvanrossum messages:
+ msg350090
|
2019-08-21 11:10:49 | steven.daprano | set | messages:
+ msg350066 |
2019-08-21 10:51:02 | vstinner | set | nosy:
+ vstinner messages:
+ msg350062
|
2019-08-21 09:56:34 | Roee Nizan | set | nosy:
+ Roee Nizan messages:
+ msg350056
|
2017-05-25 19:33:10 | brett.cannon | set | messages:
+ msg294510 |
2017-05-25 12:01:36 | serhiy.storchaka | set | nosy:
+ brett.cannon
|
2017-05-22 08:41:37 | wolma | set | messages:
+ msg294133 |
2017-05-22 08:32:30 | wolma | set | messages:
+ msg294132 |
2017-05-21 20:39:22 | serhiy.storchaka | set | messages:
+ msg294111 title: Add fnmatch.filter_false function -> Add fnmatch.filterfalse function |
2017-05-21 19:31:54 | wolma | set | messages:
+ msg294110 |
2017-05-21 06:39:16 | steven.daprano | set | messages:
+ msg294078 |
2017-05-20 18:41:13 | serhiy.storchaka | set | dependencies:
+ Improve fnmatch testing components:
+ Library (Lib) |
2017-05-20 17:34:34 | serhiy.storchaka | set | messages:
+ msg294041 |
2017-05-20 17:31:47 | wolma | set | messages:
+ msg294039 |
2017-05-20 17:26:41 | serhiy.storchaka | set | messages:
+ msg294038 |
2017-05-20 17:08:40 | wolma | set | files:
+ filterfalse.alternate_patch nosy:
+ wolma messages:
+ msg294036
|
2017-05-20 14:31:54 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka
messages:
+ msg294031 stage: patch review |
2017-05-20 13:19:02 | steven.daprano | create | |