This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: pathlib: Path.match does not work on paths
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.11
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: eric.araujo, eric.smith, nickpapior, ronaldoussoren
Priority: normal Keywords:

Created on 2021-11-24 07:54 by nickpapior, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (12)
msg406910 - Author: Nick Papior (nickpapior) Date: 2021-11-24 07:54
The documentation of Path.match only says it will match a pattern.

But quite often this pattern may be desirable to have as a Path as well.

import pathlib as pl

path = pl.Path("foo/bar")
print(path.match("bar"))
print(path.match(pl.Path("bar")))

However, the last one fails and one has to resort to 

print(path.match(str(pl.Path("bar"))))

which in my opinion is a little misleading.

I couldn't find any other bug or enhancement report about this. It probably applies to later versions as well.
msg406912 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2021-11-24 08:57
This would definitely be a new feature and not something that can be back ported.

That said, I don't understand why it is desirable to use a Path as the match argument. That argument is a glob pattern (such as "*.py") and not a file name.
msg406916 - Author: Nick Papior (nickpapior) Date: 2021-11-24 10:09
Ok, I see this as a feature. :)

As for why it is desirable:

A part of a path is still a path, and matching for something must mean that you are matching a partial path.

Even if you use '*.py' as the pattern this would still make sense as a path:

path = pl.Path("foo/bar")
print(path.match("bar"))
print(path.match(str(pl.Path("bar"))))
print(path.match(str(pl.Path("*"))))

The idea is that *anything* that can match a path _is_ a sub-path by definition, otherwise it can't be matched. So allowing a Path is just as natural, as far as I can see.

I think the same argument also holds for Path.glob and Path.rglob, where the pattern could just as well be a Path.
msg406919 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2021-11-24 11:38
Match doesn't match paths; it basically does a regular expression match on the textual representation (using glob syntax instead of normal regular expression syntax).

Because of this I don't agree with your idea that anything that can match a path is a sub-path. 

What is your use case for this?
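
A minimal sketch of the behaviour described above, assuming made-up example paths and using PurePosixPath so the result does not depend on the platform. A relative pattern is compared against the right-hand end of the path:

```python
from pathlib import PurePosixPath

# Hypothetical example path, chosen only for illustration.
path = PurePosixPath("/a/b/c.py")

# A relative pattern is matched from the right-hand end of the path.
matches_name = path.match("*.py")      # pattern covers the last component
matches_tail = path.match("b/*.py")    # pattern covers the last two components
matches_inner = path.match("a/*.py")   # "a" is not the direct parent of c.py

print(matches_name, matches_tail, matches_inner)
```

Only the first two calls match; the third does not, because the pattern is not anchored to an arbitrary position inside the path.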
msg406920 - Author: Nick Papior (nickpapior) Date: 2021-11-24 11:48
> Because of this I don't agree with your idea that anything that can match a path is a sub-path. 

Why not? If a match is True, it means that what is matched must be some kind of valid path matching a glob specification, whether that is a regular expression or anything else. If one did $(ls pattern), one would list the paths that match the pattern, and each of those is a path. Agreed, the pattern itself is not necessarily a fixed/single path, but a shell glob path. Yet the matches will be paths regardless.

As for the use case, I want to assert that a file's path has a parent that matches another directory/filename, something like this:


ref_file = Path("hello")
for f in dir.iterdir():
    if f.parent.match(ref_file):
        <do something>

in the real application the match is a bit more complex with nested directories as well as a recursive iterator.

Lastly, you say:
> That said, I don't understand why it is desirable to use a Path as the match argument.

I am on the other side:
I don't understand why it is undesirable to use a Path as the match argument.

:)

A simple

if isinstance(pattern, PurePath):
   pattern = str(pattern)

would suffice. Or possibly str(pattern.expanduser()) for consistency.
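
The conversion suggested above can be tried without changing pathlib by wrapping it in a small helper. This is only a sketch: `match_pattern` is a hypothetical name, not part of pathlib, and PurePosixPath is used so that the string form always contains forward slashes.

```python
from pathlib import PurePath, PurePosixPath


def match_pattern(path, pattern):
    """Hypothetical helper: accept the pattern as a str or a PurePath."""
    if isinstance(pattern, PurePath):
        # Same conversion as in the snippet above.
        pattern = str(pattern)
    return path.match(pattern)


path = PurePosixPath("foo/bar")
result_from_str = match_pattern(path, "bar")
result_from_path = match_pattern(path, PurePosixPath("bar"))
print(result_from_str, result_from_path)
```

Note that expanduser() is defined on concrete Path objects rather than on PurePath, so the sketch leaves it out.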
msg406922 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2021-11-24 12:24
I'm not sure what your code tries to accomplish, does it check that ``f.parent`` refers to the same location as ``ref_file``? A clearer solution for that would be ``f.parent.resolve() == ref_file.resolve()``. 
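
A small sketch of that comparison, assuming a throwaway directory layout created only for the illustration (the names `hello` and `data.txt` are made up):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "hello").mkdir()
    f = base / "hello" / "data.txt"
    f.touch()

    # A differently spelled reference to the same directory.
    ref_file = base / "hello" / ".." / "hello"

    # resolve() normalises both sides before they are compared.
    same_location = f.parent.resolve() == ref_file.resolve()

print(same_location)
```

This checks that both paths refer to the same location; it does not express a partial match on trailing components.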

----

The argument to match, glob and rglob cannot be Paths because the argument is not a path but a pattern. Those are conceptually different.

What would ``Path("dir/some.py").match(Path("*.py"))`` return?
msg406923 - Author: Nick Papior (nickpapior) Date: 2021-11-24 12:38
It basically checks that some part of the path is the same as some part of a reference path. They need not have the same complete parent, which is why comparing the resolved paths would always fail.

------

As for your last example, that will be quite easily handled:

> What would ``Path("dir/some.py").match(Path("*.py"))`` return?

str(Path("*.py")) == "*.py"

So no problems here.
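
A short sketch of that round trip, using PurePosixPath so the output is platform independent. The last two lines are an added observation rather than something stated in this thread: a pattern that passes through a path object is normalised like any other path, so its text can change.

```python
from pathlib import PurePosixPath

# The simple case from the discussion: the text survives unchanged.
pattern_text = str(PurePosixPath("*.py"))
result = PurePosixPath("dir/some.py").match(pattern_text)

# Path normalisation also applies to patterns stored as paths.
dot_prefixed = str(PurePosixPath("./*.py"))      # leading "." is dropped
double_slash = str(PurePosixPath("dir//*.py"))   # "//" collapses to "/"

print(pattern_text, result, dot_prefixed, double_slash)
```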

It would even allow users to combine patterns more easily:

suffix_path = Path("*.py")

if path.match("home" / suffix_path):
    ...  # process this
elif path.match("other" / suffix_path):
    ...  # process this


The equivalent code would have been:

suffix_path = "*.py"
if path.match(os.path.join("home", suffix_path)):
    ...  # process this
elif path.match(os.path.join("other", suffix_path)):
    ...  # process this


I think the former does not cause any confusion, nor does it seem to me to introduce anything that contradicts the meaning of match/glob/rglob.
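
With match accepting only strings, the same combination can be sketched by building the pattern with the "/" operator and converting it explicitly. The example path is made up, and PurePosixPath keeps the separators platform independent:

```python
from pathlib import PurePosixPath

suffix = "*.py"
path = PurePosixPath("/srv/home/script.py")  # hypothetical example path

# Build each pattern with "/" and hand match() its string form.
home_pattern = str(PurePosixPath("home") / suffix)
other_pattern = str(PurePosixPath("other") / suffix)

in_home = path.match(home_pattern)
in_other = path.match(other_pattern)
print(home_pattern, in_home, other_pattern, in_other)
```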
msg406924 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2021-11-24 12:56
I don't think our opinions about this will converge, so I'm leaving this discussion.

>> What would ``Path("dir/some.py").match(Path("*.py"))`` return?
>
> str(Path("*.py")) == "*.py"
>
> So no problems here.

I do think this is a problem; treating a Path like a pattern feels wrong to me.
msg406931 - Author: Nick Papior (nickpapior) Date: 2021-11-24 14:04
Thanks for the discussion.
msg407087 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2021-11-26 19:43
FWIW I think in the same way as Ronald.

A pattern is not a path, it’s a string expressing rules.
If it matches, the results are paths, but that does not make the pattern a path.
msg407469 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-12-01 16:30
I agree with Éric and Ronald.
msg407492 - Author: Nick Papior (nickpapior) Date: 2021-12-01 21:38
Ok, I can accept a no-fix ;)

I'll close this.
History
Date                 User            Action  Args
2022-04-11 14:59:52  admin           set     github: 90047
2021-12-01 21:38:29  nickpapior      set     status: open -> closed; resolution: wont fix; messages: + msg407492; stage: resolved
2021-12-01 16:30:55  eric.smith      set     nosy: + eric.smith; messages: + msg407469
2021-11-26 19:43:49  eric.araujo     set     nosy: + eric.araujo; messages: + msg407087
2021-11-24 14:04:13  nickpapior      set     messages: + msg406931
2021-11-24 12:56:18  ronaldoussoren  set     messages: + msg406924
2021-11-24 12:38:57  nickpapior      set     messages: + msg406923
2021-11-24 12:24:12  ronaldoussoren  set     messages: + msg406922
2021-11-24 11:48:28  nickpapior      set     messages: + msg406920
2021-11-24 11:38:14  ronaldoussoren  set     messages: + msg406919
2021-11-24 10:09:55  nickpapior      set     messages: + msg406916
2021-11-24 08:57:17  ronaldoussoren  set     nosy: + ronaldoussoren; messages: + msg406912; versions: + Python 3.11, - Python 3.6, Python 3.7, Python 3.8, Python 3.9
2021-11-24 07:54:31  nickpapior      create