
Author paul.moore
Recipients eryksun, kmaork, paul.moore, steve.dower, tim.golden, zach.ware
Date 2019-03-24.11:29:15
Message-id <CACac1F9jUfwcn84PhNKD_=Kf0hB=x27uVLsBz=0hp_JdLSNyUg@mail.gmail.com>
In-reply-to <1553387138.08.0.383636178937.issue36305@roundup.psfhosted.org>
Content
> There's no way to split it up as the joining of two pathlib paths because there is no way to represent "c:a" by itself as anything other than a drive-relative path. The name "./c:a" has to be taken as a unit, which is fundamentally different from "c:a". pathlib agrees:
>
>     >>> p1 = Path('./c:a')
>     >>> p1
>     WindowsPath('c:a')
>     >>> [p1.drive, p1.root, p1.parts]
>     ['', '', ('c:a',)]
>
>     >>> p2 = Path('.') / Path('c:a')
>     >>> p2
>     WindowsPath('c:a')
>     >>> [p2.drive, p2.root, p2.parts]
>     ['c:', '', ('c:', 'a')]
>
> Path('./c:a') is correctly parsed as a relative filename (no root and no drive). So, if it helps any, on the PR I wasn't requesting to change how it's parsed. The ambiguity is due to the way pathlib always collapses all "." components. I would like it to retain an initial "." component. That way the string representation will come out correctly as ".\\c:a" as opposed to the drive-relative path "c:a".

Ah, I think I follow now. But I'm not sure what you mean by wanting it
to "retain an initial '.' component" - how would you expect that to
work in practice? p1.parts == ('.', 'c:a')? I suspect that could break
existing code. In 99% of cases an initial ./ *is* semantically
irrelevant, and people expect it to be omitted. Upsetting that
expectation for something so rare, while technically correct, is
something we need to be careful of. Maybe it needs to be retained in a
new attribute in the Path object, which affects the str() conversion,
but not the existing attributes like parts.
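
To make the "new attribute" idea a little more concrete, here is a
minimal sketch under stated assumptions, not a proposal for the actual
implementation. DotAwarePath and explicit_dot are hypothetical names
that do not exist in pathlib, and the wrapper uses PurePosixPath only
so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath


class DotAwarePath:
    """Hypothetical wrapper (not a pathlib API) that remembers a leading './'."""

    def __init__(self, raw: str):
        self._path = PurePosixPath(raw)
        # Assumption: only a literal leading "./" counts as "explicit".
        self.explicit_dot = raw.startswith("./")

    @property
    def parts(self):
        # Existing attributes are untouched: the "." never appears here.
        return self._path.parts

    def __str__(self):
        text = str(self._path)
        # Only the string conversion is affected by the remembered flag.
        if self.explicit_dot and text != ".":
            return "./" + text
        return text


p = DotAwarePath("./foo")
print(p.parts)  # ('foo',)
print(str(p))   # ./foo
```

Code that only looks at parts sees no change, while str() round-trips
the leading "./" for callers that care about it.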

> Some Windows API and runtime library functions behave differently depending whether a relative path has a leading "." or ".." component. We're at a disadvantage if we throw this information away.

Yes, I see your point now. Whether the initial string representation
was foo or ./foo is semantic information that we need to retain for
those places where it matters. But my concern is at the other end of
the equation - we need to be careful, having retained that semantic
information, not to have it intrude on existing, working use cases.

> The CreateProcessW case is a generalization of the case that we're used to across various platforms, in which, for the sake of security, the "." entry is excluded from PATH. In this case, the only way to run an executable in the working directory is to reference it explicitly. For example (in Linux):
[...]
> This would work if pathlib kept the initial "." component.

Thanks, this is a really useful example, as it makes it clear that
this is a general issue, not a platform-specific quirk.
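
The general problem can be sketched without running any process. The
helper searched_on_path and the program name "prog" are hypothetical,
used only for illustration; the helper mirrors the POSIX exec*p rule
that a command name containing no slash is looked up on PATH.

```python
from pathlib import PurePosixPath


def searched_on_path(command: str) -> bool:
    """POSIX exec*p rule: a name with no slash is looked up on PATH."""
    return "/" not in command


original = "./prog"  # hypothetical executable in the working directory
round_tripped = str(PurePosixPath(original))

print(round_tripped)                     # prog
print(searched_on_path(original))        # False
print(searched_on_path(round_tripped))   # True
```

After the round trip through pathlib, the command no longer refers
explicitly to the working directory, so it would be resolved via PATH
instead.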

> An example where we currently retain information that's not obviously needed is with ".." components. Even Path.absolute() retains ".." components. It's important in POSIX. For example, "spam/../eggs" shouldn't be reduced to "eggs" because "spam" might be a symlink. This doesn't generally matter in Windows, since it normalizes paths in user mode as strings before they're passed to the kernel, but we still don't throw the information away because it could be useful to code that implements POSIX-like behavior.

Yes. Maybe not stripping an initial ./ can be modelled on that example
- the documentation
(https://docs.python.org/3.7/library/pathlib.html#pure-paths) says
"Spurious slashes and single dots are collapsed, but double dots
('..') are not, since this would change the meaning of a path in the
face of symbolic links" - that should be expanded to clarify that an
initial "." is similarly retained because removing it would change the
meaning in the face of subprocess invocations which rely on it to
explicitly allow running a file from the current directory, or Windows
files with streams.
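
For reference, the currently documented behaviour quoted above can be
checked with the POSIX flavour, which gives the same result on any
platform:

```python
from pathlib import PurePosixPath

# Single dots and spurious slashes are collapsed ...
print(PurePosixPath("./a//b/./c"))          # a/b/c

# ... but ".." is kept, because "spam" could be a symlink.
print(PurePosixPath("spam/../eggs"))        # spam/../eggs
print(PurePosixPath("spam/../eggs").parts)  # ('spam', '..', 'eggs')
```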

The exact behaviour needs to be clarified, of course:

Path('./a')
Path('./a:b')
Path('.', 'a')
Path('./', 'a')
Path('.', '.', 'a')

... etc. We should have well-defined behaviour for all of these (I'm
not saying it's *hard* to define the behaviour), and tests to ensure
it's followed.
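
As a starting point for such tests, the cases above could be tabulated
with a small helper. This is only a sketch: describe and CASES are
hypothetical names, and it records what pathlib does today with the
POSIX flavour rather than what the behaviour should become.

```python
from pathlib import PurePosixPath

# The constructor arguments listed above, one tuple per case.
CASES = [("./a",), ("./a:b",), (".", "a"), ("./", "a"), (".", ".", "a")]


def describe(path_cls, *args):
    """Return the string form and the attributes under discussion."""
    p = path_cls(*args)
    return str(p), p.drive, p.root, p.parts


for args in CASES:
    print(args, "->", describe(PurePosixPath, *args))
```

Passing PureWindowsPath instead shows the Windows flavour; no output is
listed for it here because the result for './a:b' depends on the Python
version in use.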

Having said all of this, I'm not at all sure how much it relates to
the original description of this issue, which didn't mention initial
'./' components at all. Is the originally reported behaviour a
*consequence* of not retaining './', or is it a separate problem? If
the latter, then maybe "Pathlib should retain an initial './'" would
be better raised as a separate bpo item (and PR)?
History
Date User Action Args
2019-03-24 11:29:15 paul.moore set recipients: + paul.moore, tim.golden, zach.ware, eryksun, steve.dower, kmaork
2019-03-24 11:29:15 paul.moore link issue36305 messages
2019-03-24 11:29:15 paul.moore create