Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pathlib.PurePath.parents rejects negative indexes #65240

Closed
4kir4 mannequin opened this issue Mar 23, 2014 · 26 comments
Closed

pathlib.PurePath.parents rejects negative indexes #65240

4kir4 mannequin opened this issue Mar 23, 2014 · 26 comments
Labels
3.10 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@4kir4
Copy link
Mannequin

4kir4 mannequin commented Mar 23, 2014

BPO 21041
Nosy @warsaw, @pitrou, @bitdancer, @4kir4, @serhiy-storchaka, @MojoVampire, @JulienPalard, @pganssle, @thejcannon, @ju-sh, @VictorGob, @maxballenger, @ypankovych
PRs
  • bpo-21041: Add negative indexing to pathlib path parents #21799
  • Files
  • pathlib-parents-allow-negative-index.patch: the fix and tests
  • allowNegativeIndexParents.patch: the fix w/o tests
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-11-23.20:06:38.142>
    created_at = <Date 2014-03-23.22:16:51.197>
    labels = ['type-feature', 'library', '3.10']
    title = 'pathlib.PurePath.parents rejects negative indexes'
    updated_at = <Date 2021-10-08.18:12:25.910>
    user = 'https://github.com/4kir4'

    bugs.python.org fields:

    activity = <Date 2021-10-08.18:12:25.910>
    actor = 'josh.r'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-11-23.20:06:38.142>
    closer = 'p-ganssle'
    components = ['Library (Lib)']
    creation = <Date 2014-03-23.22:16:51.197>
    creator = 'akira'
    dependencies = []
    files = ['34595', '48949']
    hgrepos = []
    issue_num = 21041
    keywords = ['patch']
    message_count = 26.0
    messages = ['214642', '214709', '214716', '214717', '215746', '223048', '223054', '223059', '225503', '332008', '352281', '352322', '363337', '363743', '373269', '373281', '375099', '375100', '375788', '381452', '381563', '381654', '381670', '381678', '381694', '403489']
    nosy_count = 13.0
    nosy_names = ['barry', 'pitrou', 'r.david.murray', 'akira', 'serhiy.storchaka', 'josh.r', 'mdk', 'p-ganssle', 'thejcannon', 'ju-sh', 'victorg', 'maxballenger', 'ypank']
    pr_nums = ['21799']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue21041'
    versions = ['Python 3.10']

    @4kir4
    Copy link
    Mannequin Author

    4kir4 mannequin commented Mar 23, 2014

    pathlib.PurePath.parents is a sequence 1 but it rejects negative indexes:

      >>> from pathlib import PurePath
      >>> PurePath('a/b/c').parents[-2]
      Traceback (most recent call last):
      ...
      IndexError: -2

    Sequences in Python interpret negative indexes as len(seq) + i 2

    I've included the patch that fixes the issue and adds corresponding tests. No documentation changes are needed.

    @4kir4 4kir4 mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Mar 23, 2014
    @bitdancer
    Copy link
    Member

    I think this is a doc bug. That object shouldn't be called a sequence, since it isn't one.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 24, 2014

    Well, it is a sequence, it's just that it doesn't respect the convention about negative indices :-)

    As to why they are disallowed, I don't remember exactly (!) but I think it's because the exact semantics would be confusing otherwise.

    @bitdancer
    Copy link
    Member

    Which is exactly what I mean by saying it is not a sequence. It is 'sequence-like'. Kind of like email Messages are dict-like: they share many methods and behaviors, but the exact behaviors and semantics are different.

    @4kir4
    Copy link
    Mannequin Author

    4kir4 mannequin commented Apr 8, 2014

    From https://docs.python.org/3/glossary.html#term-sequence

    An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence.

    .parents *is* a sequence. And it *is* confusing that it doesn't accept negative indexes -- that is how I've encountered the bug.

    Antoine, could you elaborate on what are the negative consequences of negative indexes to justify breaking the expectations?

    @warsaw
    Copy link
    Member

    warsaw commented Jul 14, 2014

    Aren't negative indexes well defined in Python? E.g.

    >>> p = Path('/tmp/tmp123/foo/bar/baz.xz')
    >>> p.parents[len(p.parents)-2]
    PosixPath('/tmp')

    p.parents[-2] should == p.parents[len(p.parents)-2]

    @4kir4
    Copy link
    Mannequin Author

    4kir4 mannequin commented Jul 14, 2014

    Aren't negative indexes well defined in Python?

    yes. I've provided the link to Python docs 1 in msg214642 that
    explicitly defines the behavior:

    If i or j is negative, the index is relative to the end of the string:
    len(s) + i or len(s) + j is substituted. But note that -0 is still 0.

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jul 14, 2014

    bpo-7951 has an interesting debate on negative indexes that is possibly applicable here.

    @4kir4
    Copy link
    Mannequin Author

    4kir4 mannequin commented Aug 18, 2014

    bpo-7951 has an interesting debate on negative indexes that is possibly applicable here.

    Mark could you point to a message that explains why p.parents[-2] is worse
    than p.parents[len(p.parents)-2]?

    @serhiy-storchaka serhiy-storchaka added 3.8 only security fixes type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Dec 16, 2018
    @thejcannon
    Copy link
    Mannequin

    thejcannon mannequin commented Dec 17, 2018

    I created bpo-35498 about .parents rejecting slices as well. (It was pointed out this discussion would probably decide that issue's fate)
    I think that .parents looking like a duck, but not quacking like one isn't very pythonic.

    Besides, the fact that p.parents[len(p.parents)-2] is allowed but p.parents[-2] is not just seems like extra steps. There's also list(p.parents)[-2], which is still not ideal. In either case, I'd imagine authors to put a comment like "PathLib .parents doesn't support negative indexes", which goes to show clients are expecting negative indices to work.

    I see that this issue is several years old. I'm happy to shepherd it if it needs further contributions.

    @JulienPalard
    Copy link
    Member

    I checked conversation in bpo-7951, tells about an ambiguity because it could be an index from a sequence or a key for a dict, like {-1: "foo"}.

    Here there is no such confusion.

    Confusion *may* arrise from the fact that it's not composed of parts, but more like it's already sliced, I mean it does NOT look like:

    ['/', 'home', 'mdk', 'clones', 'python']

    It's more like:

    ['/home/mdk/clones/python', '/home/mdk/clones', '/home/mdk', '/home', '/']

    In fact I'd say it behave more like a function call than a sequence access, I read:

    pathlib.Path.cwd().parents[1]

    a bit like:

    pathlib.Path.cwd().parents(go_down=1)

    It may explain why negative indices or slices were initially not implemented: It already looks like the result of a slice.

    @thejcannon
    Copy link
    Mannequin

    thejcannon mannequin commented Sep 13, 2019

    it may explain why negative indices or slices were initially not implemented: It already looks like the result of a slice.

    Sure the values of the sequence could be thought of as being increasingly smaller slices of some other sequence, however I don't think it changes the fact that "parents" is a sequence, and sequences have well-defined semantics for negative indices and slices. Semantics which people expect, and have to find smelly workarounds for.

    @VictorGob
    Copy link
    Mannequin

    VictorGob mannequin commented Mar 4, 2020

    Allow negative indexes that could be usefull.

    Example: to compare 2 or more Path, if they come from the same top directory

    from pathlib import Path
    a = Path("/a/testpy/cpython/config.log")
    b = Path("/b/testpy/cpython/config.log")
    c = Path("/a/otherfolder/text.txt")
    print(f"a.parents[-2] == b.parents[-2] -> {a.parents[-2] == b.parents[-2]}") # False
    print(f"a.parents[-2] == c.parents[-2] -> {a.parents[-2] == c.parents[-2]}") # True 
    # index = -2 because -1 is "/"

    @ju-sh
    Copy link
    Mannequin

    ju-sh mannequin commented Mar 9, 2020

    Can't this be implemented? This is something that a user would expect. Intuitive. And it looks as if it is an easy change to make that doesn't disturb anything else.

    @maxballenger
    Copy link
    Mannequin

    maxballenger mannequin commented Jul 8, 2020

    Use case: I want to see if a Path is a descendent of /tmp.

    if filepath.parents[-2] == Path('tmp'):

    turns into

    if filepath.parents[len(filepath.parents)-2] == Path('tmp'):

    @serhiy-storchaka
    Copy link
    Member

    Maxwell, in your case a more correct and obvious way is

    Path('/tmp') in filepath.parents
    

    although it may be not very efficient. Using the is_relative_to() method may be more efficient and obvious.

    @ypankovych
    Copy link
    Mannequin

    ypankovych mannequin commented Aug 10, 2020

    That's kinda weird for python. I mean, in regular list/etc if I need the last element, I'd normally do list[-1], but here to get the last parent, I need to actually know how many parents do I have.

    So now, I can do something like this:

    >>> parents_count = len(path.parents) - 1
    >>> path.parents[parents_count]
    PosixPath('.')

    @ypankovych ypankovych mannequin added 3.7 (EOL) end of life 3.9 only security fixes 3.10 only security fixes labels Aug 10, 2020
    @ypankovych
    Copy link
    Mannequin

    ypankovych mannequin commented Aug 10, 2020

    Here's possible fix: #21799

    @ypankovych
    Copy link
    Mannequin

    ypankovych mannequin commented Aug 22, 2020

    Any thoughts about that folks? It's a pretty old bug, let's decide smth for it.

    @pganssle
    Copy link
    Member

    I am not seeing any compelling reasons to avoid supporting negative indexes *or* slices here.

    If I had to guess about the confusing semantics of negative indices, I would guess it's the fact that the index in the -1 position for a non-empty Path will always be Path('.'). Since that's not terribly useful, it might be reasonable to have negative indices start counting at len(p)-2.

    That said, I don't think this is a big deal, and I think we have more speculation on why this was avoided in the first place than we have actual objections to changing it, so I vote for changing it.

    I think our best option is to say that the semantics of indexing .parents should be the same as indexing the result of casting it to a tuple, so this should be true:

        p = Path(x)
        assert p.parents[y] == tuple(p.parents)[y]

    For all values of x and y.

    I've gone ahead and changed the version support matrix to 3.10 only, since I think that this was a deliberate choice and we should be considering this an enhancement rather than a bugfix. That said, I'll admit that it's on the borderline — the semantics of sequences are unambiguous (see, which says that sequences support both slices and negative indices: https://docs.python.org/3/library/stdtypes.html#typesseq ), and PEP-428 explicitly says that .parents returns a "an immutable sequence of the path's logical ancestors": https://www.python.org/dev/peps/pep-0428/#sequence-like-access . So if someone is motivated to try and make the case that this is a bugfix that could be backported to earlier supported versions, I won't stand in their way.

    @pganssle pganssle removed 3.7 (EOL) end of life 3.8 only security fixes labels Nov 19, 2020
    @pganssle pganssle removed the 3.9 only security fixes label Nov 19, 2020
    @ypankovych
    Copy link
    Mannequin

    ypankovych mannequin commented Nov 21, 2020

    That makes sense, but should we have this behaviour only for negative indices?

    We'll end up with something lie:

    path.parents[len(path.parents) - 1] != path.parents[-1]

    I think that is should be consistent regardless of negative/positive indices.

    @ypankovych
    Copy link
    Mannequin

    ypankovych mannequin commented Nov 23, 2020

    And it looks like a special case, so "Special cases aren't special enough to break the rules."

    @pganssle
    Copy link
    Member

    I think you may have confused my thoughts as to why this might be considered ambiguous with an actual suggestion for what the semantics should be.

    I think that we should stick with p.parents[x] == tuple(p.parents)[x] for any valid value of x, which means that p.parents[-1] will always be Path('.') for any non-empty p.

    @ypankovych
    Copy link
    Mannequin

    ypankovych mannequin commented Nov 23, 2020

    Agree with that, it currently supports this behavior.

    @pganssle
    Copy link
    Member

    New changeset 79d2e62 by Yaroslav Pankovych in branch 'master':
    Added support for negative indexes to PurePath.parents (GH-21799)
    79d2e62

    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented Oct 8, 2021

    Negative indexing is broken for absolute paths, see bpo-45414.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants