classification
Title: pathlib.PurePath.parents rejects negative indexes
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.10
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: akira, barry, ju-sh, maxballenger, mdk, p-ganssle, pitrou, r.david.murray, serhiy.storchaka, thejcannon, victorg, ypank
Priority: normal
Keywords: patch

Created on 2014-03-23 22:16 by akira, last changed 2020-11-23 20:06 by p-ganssle. This issue is now closed.

Files
File name | Uploaded | Description
pathlib-parents-allow-negative-index.patch | akira, 2014-03-23 22:16 | the fix and tests
allowNegativeIndexParents.patch | victorg, 2020-03-04 08:13 | the fix w/o tests
Pull Requests
URL | Status | Linked
PR 21799 | merged | ypank, 2020-08-10 07:25
Messages (25)
msg214642 - (view) Author: Akira Li (akira) * Date: 2014-03-23 22:16
`pathlib.PurePath.parents` is a sequence [1] but it rejects negative indexes:

  >>> from pathlib import PurePath
  >>> PurePath('a/b/c').parents[-2]
  Traceback (most recent call last):
  ...
  IndexError: -2

Sequences in Python interpret negative indexes as `len(seq) + i` [2]

I've included the patch that fixes the issue and adds corresponding tests. No documentation changes are needed.

[1]: http://docs.python.org/3/library/pathlib#pathlib.PurePath.parents
[2]: http://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range
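
The `len(seq) + i` rule can be applied by hand. The following is only a minimal sketch, not the attached patch: the helper name `parent_at` is hypothetical, and `PurePosixPath` is used so the result does not depend on the platform.

```python
from pathlib import PurePosixPath


def parent_at(path, index):
    """Return path.parents[index], accepting negative indexes.

    Hypothetical helper: it applies the len(seq) + i rule manually, so it
    does not rely on .parents itself accepting negative indexes.
    """
    parents = path.parents
    if index < 0:
        index += len(parents)
    if not 0 <= index < len(parents):
        raise IndexError(index)
    return parents[index]


p = PurePosixPath('a/b/c')
print(parent_at(p, 0))   # a/b
print(parent_at(p, -2))  # a
print(parent_at(p, -1))  # .
```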
msg214709 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-24 18:55
I think this is a doc bug.  That object shouldn't be called a sequence, since it isn't one.
msg214716 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-03-24 19:50
Well, it is a sequence, it's just that it doesn't respect the convention about negative indices :-)

As to why they are disallowed, I don't remember exactly (!) but I think it's because the exact semantics would be confusing otherwise.
msg214717 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-24 19:57
Which is exactly what I mean by saying it is not a sequence.  It is 'sequence-like'.  Kind of like email Messages are dict-like: they share many methods and behaviors, but the exact behaviors and semantics are different.
msg215746 - (view) Author: Akira Li (akira) * Date: 2014-04-08 09:05
From https://docs.python.org/3/glossary.html#term-sequence

> An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence.

.parents *is* a sequence. And it *is* confusing that it doesn't accept negative indexes -- that is how I've encountered the bug.

Antoine, could you elaborate on what negative consequences of negative indexes would justify breaking the expectations?
msg223048 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2014-07-14 18:55
Aren't negative indexes well defined in Python?  E.g.

>>> p = Path('/tmp/tmp123/foo/bar/baz.xz')
>>> p.parents[len(p.parents)-2]
PosixPath('/tmp')

p.parents[-2] should == p.parents[len(p.parents)-2]
msg223054 - (view) Author: Akira Li (akira) * Date: 2014-07-14 20:16
> Aren't negative indexes well defined in Python?  

Yes. I've provided the link to the Python docs [1] in msg214642 that
explicitly defines the behavior:

> If i or j is negative, the index is relative to the end of the string: 
> len(s) + i or len(s) + j is substituted. But note that -0 is still 0.

[1]: https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range
msg223059 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-07-14 20:59
#7951 has an interesting debate on negative indexes that is possibly applicable here.
msg225503 - (view) Author: Akira Li (akira) * Date: 2014-08-18 19:17
> #7951 has an interesting debate on negative indexes that is possibly applicable here.

Mark, could you point to a message that explains why p.parents[-2] is worse
than p.parents[len(p.parents)-2]?
msg332008 - (view) Author: Joshua Cannon (thejcannon) * Date: 2018-12-17 15:06
I created issue35498 about .parents rejecting slices as well. (It was pointed out that this discussion would probably decide that issue's fate.)
I think that .parents looking like a duck but not quacking like one isn't very Pythonic.

Besides, the fact that p.parents[len(p.parents)-2] is allowed but p.parents[-2] is not just seems like extra steps. There's also list(p.parents)[-2], which is still not ideal. In either case, I'd imagine authors would add a comment like "pathlib .parents doesn't support negative indexes", which goes to show clients are expecting negative indices to work.

I see that this issue is several years old. I'm happy to shepherd it if it needs further contributions.
msg352281 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2019-09-13 10:30
I checked the conversation in #7951. It is about an ambiguity that arises because the value could be an index into a sequence or a key for a dict, like {-1: "foo"}.

Here there is no such confusion.

Confusion *may* arise from the fact that it's not composed of parts, but is more like something already sliced. I mean it does NOT look like:

   ['/', 'home', 'mdk', 'clones', 'python']

It's more like:

   ['/home/mdk/clones/python', '/home/mdk/clones', '/home/mdk', '/home', '/']


In fact, I'd say it behaves more like a function call than a sequence access. I read:

   pathlib.Path.cwd().parents[1]

a bit like:

   pathlib.Path.cwd().parents(go_down=1)

It may explain why negative indices or slices were initially not implemented: It already looks like the result of a slice.
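
The difference between the two shapes can be seen by comparing `.parts` with `.parents`. This is only an illustrative sketch using `PurePosixPath` with the example path from above; note that `.parents` starts at the immediate parent and does not include the path itself.

```python
from pathlib import PurePosixPath

p = PurePosixPath('/home/mdk/clones/python')

# .parts is the "composed of parts" view of the path.
print(p.parts)
# ('/', 'home', 'mdk', 'clones', 'python')

# .parents is the "already sliced" view: each item is a complete path.
print([str(ancestor) for ancestor in p.parents])
# ['/home/mdk/clones', '/home/mdk', '/home', '/']
```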
msg352322 - (view) Author: Joshua Cannon (thejcannon) * Date: 2019-09-13 13:26
> it may explain why negative indices or slices were initially not implemented: It already looks like the result of a slice.

Sure, the values of the sequence could be thought of as increasingly smaller slices of some other sequence. However, I don't think that changes the fact that "parents" is a sequence, and sequences have well-defined semantics for negative indices and slices. These are semantics which people expect, and for which they currently have to find smelly workarounds.
msg363337 - (view) Author: victorg (victorg) Date: 2020-03-04 08:13
Allowing negative indexes could be useful.

Example: checking whether two or more Path objects come from the same top-level directory:

from pathlib import Path
a = Path("/a/testpy/cpython/config.log")
b = Path("/b/testpy/cpython/config.log")
c = Path("/a/otherfolder/text.txt")
print(f"a.parents[-2] == b.parents[-2] -> {a.parents[-2] == b.parents[-2]}") # False
print(f"a.parents[-2] == c.parents[-2] -> {a.parents[-2] == c.parents[-2]}") # True 
# index = -2 because -1 is "/"
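
The same comparison can be sketched without relying on negative indexes being accepted by `.parents`, by converting to a tuple first. The helper name `top_level_dir` is hypothetical, and `PurePosixPath` is used so the result does not depend on the platform.

```python
from pathlib import PurePosixPath


def top_level_dir(path):
    """Return the ancestor just below the last one, or None if there is none.

    Hypothetical helper: tuple(path.parents) is an ordinary tuple, so
    negative indexes always work on it.
    """
    ancestors = tuple(path.parents)
    if len(ancestors) < 2:
        return None
    return ancestors[-2]


a = PurePosixPath("/a/testpy/cpython/config.log")
b = PurePosixPath("/b/testpy/cpython/config.log")
c = PurePosixPath("/a/otherfolder/text.txt")

print(top_level_dir(a) == top_level_dir(b))  # False
print(top_level_dir(a) == top_level_dir(c))  # True
```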
msg363743 - (view) Author: Julin (ju-sh) * Date: 2020-03-09 15:54
Can't this be implemented? This is something that a user would expect. Intuitive. And it looks as if it is an easy change to make that doesn't disturb anything else.
msg373269 - (view) Author: Maxwell Ballenger (maxballenger) Date: 2020-07-08 01:41
Use case: I want to see if a Path is a descendant of /tmp.

if filepath.parents[-2] == Path('/tmp'):

turns into

if filepath.parents[len(filepath.parents)-2] == Path('/tmp'):
msg373281 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-07-08 06:57
Maxwell, in your case a more correct and obvious way is

    Path('/tmp') in filepath.parents

although it may not be very efficient. Using the is_relative_to() method may be more efficient and obvious.
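
A short sketch of the two alternatives, assuming Python 3.9 or later (where `is_relative_to()` is available) and an example file path made up for illustration. The two checks differ for the directory itself: a path is never one of its own parents, but it is relative to itself.

```python
from pathlib import PurePosixPath

tmp = PurePosixPath('/tmp')
filepath = PurePosixPath('/tmp/build/output.log')  # illustrative path

# Membership test against the sequence of ancestors.
print(tmp in filepath.parents)       # True

# Direct check, available since Python 3.9.
print(filepath.is_relative_to(tmp))  # True

# The two checks disagree for /tmp itself.
print(tmp in tmp.parents)            # False
print(tmp.is_relative_to(tmp))       # True
```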
msg375099 - (view) Author: Yaroslav Pankovych (ypank) * Date: 2020-08-10 07:25
That's kinda weird for Python. I mean, with a regular list etc., if I need the last element, I'd normally do list[-1], but here, to get the last parent, I need to actually know how many parents I have.

So for now, I have to do something like this:

>>> parents_count = len(path.parents) - 1
>>> path.parents[parents_count]
PosixPath('.')
msg375100 - (view) Author: Yaroslav Pankovych (ypank) * Date: 2020-08-10 07:25
Here's a possible fix: https://github.com/python/cpython/pull/21799
msg375788 - (view) Author: Yaroslav Pankovych (ypank) * Date: 2020-08-22 08:41
Any thoughts about that, folks? It's a pretty old bug; let's decide something for it.
msg381452 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2020-11-19 18:30
I am not seeing any compelling reasons to avoid supporting negative indexes *or* slices here.

If I had to guess about the confusing semantics of negative indices, I would guess it's the fact that the index in the -1 position for a non-empty Path will always be `Path('.')`. Since that's not terribly useful, it might be reasonable to have negative indices start counting at `len(p)-2`.

That said, I don't think this is a big deal, and I think we have more speculation on why this was avoided in the first place than we have actual objections to changing it, so I vote for changing it.

I think our best option is to say that the semantics of indexing `.parents` should be the same as indexing the result of casting it to a tuple, so this should be true:

    p = Path(x)
    assert p.parents[y] == tuple(p.parents)[y]

For all values of `x` and `y`.

I've gone ahead and changed the version support matrix to 3.10 only, since I think that this was a deliberate choice and we should be considering this an enhancement rather than a bugfix. That said, I'll admit that it's on the borderline: the semantics of sequences are unambiguous (see https://docs.python.org/3/library/stdtypes.html#typesseq , which says that sequences support both slices and negative indices), and PEP 428 explicitly says that .parents returns "an immutable sequence of the path's logical ancestors": https://www.python.org/dev/peps/pep-0428/#sequence-like-access . So if someone is motivated to try and make the case that this is a bugfix that could be backported to earlier supported versions, I won't stand in their way.
msg381563 - (view) Author: Yaroslav Pankovych (ypank) * Date: 2020-11-21 14:10
That makes sense, but should we have this behaviour only for negative indices?

We'll end up with something like:

path.parents[len(path.parents) - 1] != path.parents[-1]

I think that it should be consistent regardless of negative/positive indices.
msg381654 - (view) Author: Yaroslav Pankovych (ypank) * Date: 2020-11-23 08:57
And it looks like a special case, so "Special cases aren't special enough to break the rules."
msg381670 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2020-11-23 15:11
I think you may have confused my thoughts as to why this might be considered ambiguous with an actual suggestion for what the semantics should be.

I think that we should stick with `p.parents[x] == tuple(p.parents)[x]` for any valid value of `x`, which means that `p.parents[-1]` will always be `Path('.')` for any non-empty `p`.
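
As a small illustration of what the last element looks like under these semantics, the sketch below goes through `tuple()` so that it does not depend on `.parents` accepting negative indexes. The sample paths are made up; for a relative path the last parent is `.`, while for an absolute path it is the root.

```python
from pathlib import PurePosixPath

relative = PurePosixPath('a/b/c')
absolute = PurePosixPath('/a/b/c')

# Last ancestor of a relative path is the empty path '.'.
print(tuple(relative.parents)[-1])  # .

# Last ancestor of an absolute path is the root.
print(tuple(absolute.parents)[-1])  # /
```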
msg381678 - (view) Author: Yaroslav Pankovych (ypank) * Date: 2020-11-23 16:40
Agreed; the PR currently supports this behavior.
msg381694 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2020-11-23 20:06
New changeset 79d2e62c008446fbbc6f264bb8a30e2d38b6ff58 by Yaroslav Pankovych in branch 'master':
Added support for negative indexes to PurePath.parents (GH-21799)
https://github.com/python/cpython/commit/79d2e62c008446fbbc6f264bb8a30e2d38b6ff58
History
Date | User | Action | Args
2020-11-23 20:06:38 | p-ganssle | set | status: open -> closed; resolution: fixed; messages: + msg381694; stage: patch review -> resolved
2020-11-23 16:40:58 | ypank | set | messages: + msg381678
2020-11-23 15:11:05 | p-ganssle | set | messages: + msg381670
2020-11-23 08:57:53 | ypank | set | messages: + msg381654
2020-11-21 14:10:23 | ypank | set | messages: + msg381563
2020-11-20 15:46:16 | p-ganssle | unlink | issue35498 dependencies
2020-11-19 18:30:47 | p-ganssle | set | nosy: + p-ganssle; messages: + msg381452; versions: - Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9
2020-08-22 08:41:43 | ypank | set | messages: + msg375788
2020-08-10 07:25:43 | ypank | set | messages: + msg375100
2020-08-10 07:25:09 | ypank | set | versions: + Python 3.5, Python 3.6, Python 3.7, Python 3.9, Python 3.10; nosy: + ypank; messages: + msg375099; pull_requests: + pull_request20938; stage: patch review
2020-08-10 04:13:11 | xtreak | link | issue41511 superseder
2020-07-08 06:57:03 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg373281
2020-07-08 01:41:55 | maxballenger | set | nosy: + maxballenger; messages: + msg373269
2020-03-09 15:54:43 | ju-sh | set | nosy: + ju-sh; messages: + msg363743
2020-03-04 08:13:16 | victorg | set | files: + allowNegativeIndexParents.patch; nosy: + victorg; messages: + msg363337
2019-09-13 13:26:18 | thejcannon | set | messages: + msg352322
2019-09-13 10:30:40 | mdk | set | nosy: + mdk; messages: + msg352281
2018-12-17 15:06:05 | thejcannon | set | nosy: + thejcannon; messages: + msg332008
2018-12-16 16:35:30 | BreamoreBoy | set | nosy: - BreamoreBoy
2018-12-16 11:27:24 | serhiy.storchaka | set | type: behavior -> enhancement; versions: + Python 3.8, - Python 3.4, Python 3.5
2018-12-16 11:27:04 | serhiy.storchaka | link | issue35498 dependencies
2014-08-18 19:17:12 | akira | set | messages: + msg225503
2014-07-14 20:59:14 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg223059
2014-07-14 20:16:46 | akira | set | messages: + msg223054
2014-07-14 18:55:58 | barry | set | messages: + msg223048
2014-07-14 18:53:23 | barry | set | nosy: + barry
2014-04-08 09:05:31 | akira | set | messages: + msg215746
2014-03-24 19:57:38 | r.david.murray | set | messages: + msg214717
2014-03-24 19:50:16 | pitrou | set | messages: + msg214716
2014-03-24 18:55:41 | r.david.murray | set | messages: + msg214709
2014-03-24 18:53:35 | r.david.murray | set | nosy: + pitrou, r.david.murray
2014-03-24 18:50:12 | akira | set | type: behavior
2014-03-23 22:16:51 | akira | create |