This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Confusing exception in Path().with_name
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: ethan.furman, iritkatriel, pitrou, r.david.murray, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2016-05-31 00:22 by Antony.Lee, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pathlib-splitroot.patch Antony.Lee, 2016-05-31 06:35 review
Messages (12)
msg266724 - (view) Author: Antony Lee (Antony.Lee) * Date: 2016-05-31 00:22
`Path().with_name` can fail with a lot of different exceptions:

    >>> Path("foo").with_name(0)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib/python3.5/pathlib.py", line 800, in with_name
        raise ValueError("Invalid name %r" % (name))
    ValueError: Invalid name 0

    >>> Path("foo").with_name(1)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib/python3.5/pathlib.py", line 797, in with_name
        drv, root, parts = self._flavour.parse_parts((name,))
    File "/usr/lib/python3.5/pathlib.py", line 62, in parse_parts
        drv, root, rel = self.splitroot(part)
    File "/usr/lib/python3.5/pathlib.py", line 268, in splitroot
        if part and part[0] == sep:
    TypeError: 'int' object is not subscriptable

    >>> Path("foo").with_name({})
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib/python3.5/pathlib.py", line 800, in with_name
        raise ValueError("Invalid name %r" % (name))
    ValueError: Invalid name {}

    >>> Path("foo").with_name({0: 1})
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib/python3.5/pathlib.py", line 797, in with_name
        drv, root, parts = self._flavour.parse_parts((name,))
    File "/usr/lib/python3.5/pathlib.py", line 69, in parse_parts
        parsed.append(sys.intern(rel))
    TypeError: must be str, not dict

    >>> Path("foo").with_name({"a": "b"})
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib/python3.5/pathlib.py", line 797, in with_name
        drv, root, parts = self._flavour.parse_parts((name,))
    File "/usr/lib/python3.5/pathlib.py", line 62, in parse_parts
        drv, root, rel = self.splitroot(part)
    File "/usr/lib/python3.5/pathlib.py", line 268, in splitroot
        if part and part[0] == sep:
    KeyError: 0

While most are fairly clear, the last one is particularly confusing IMO; additionally, looking at the implementation of `_Flavour.parse_parts` there seems to be room for even more confusion with a properly crafted dict.

Of course, with Python's dynamicity a lot of exceptions can be triggered at weird places using carefully crafted objects, but I think pathlib should at least raise a clear exception when passed an incorrect builtin type.
msg266727 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-31 00:50
I don't think so.  Python uses duck typing, and one of the consequences of that is you can get weird errors if you pass in a type that sort-of-works but doesn't really.  Arbitrarily restricting the input type is not something we do (that's what the whole static type checking thing is about...doing that as an external lint process.)
msg266730 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2016-05-31 02:58
I think the request is to raise a single PathlibError instead of the broad range of possible errors.  Something like:

try:
   blah blah
except TypeError as e:
   raise PathlibError(str(e))

Since pathlib is a high level library this seems appropriate.
msg266733 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-31 04:18
Again, I don't think so.  We don't generally wrap TypeErrors or ValueErrors.  On the other hand, if the try/except can *add* additional contextual information about the error, it might be worthwhile.
msg266738 - (view) Author: Antony Lee (Antony.Lee) * Date: 2016-05-31 06:35
Actually, a simpler approach may be the attached patch (pathlib-splitroot.patch), in which case the exception is

    AttributeError: 'dict' object has no attribute 'lstrip'

which is pretty commonly seen when a wrong type is passed to a function.

(Note that it is already possible to trigger this exception by passing in `{0: "/"}` so it's no worse than before)
msg266754 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-31 15:07
It's a resaonable approach, but I question if the gain is worth the chance of introducing bugs (behavior changes...for example, a None argument would be handled differently by the patch).
msg266765 - (view) Author: Antony Lee (Antony.Lee) * Date: 2016-05-31 18:27
Would it?  Note that splitroot is not public, and some quick tests did not show any other behavior difference from the public API side.
msg266766 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-31 18:34
The "part and part[0] == sep" check is here for speed. pathlib-splitroot.patch removes this optimization.
msg266767 - (view) Author: Antony Lee (Antony.Lee) * Date: 2016-05-31 18:46
First, I doubt that this optimization actually changes anything measurable, but if you want you can always replace that line by "if part.startswith(sep)" (also solving the original issue).

Additionally, note that `parse_parts` (the only function to call `splitroot` already (and redundantly) checks that `part` is truthy before calling `splitroot`.  There are a couple of other "optimizations" in the same function (`if x and x != '.'`, `if rel and rel != '.'`) for which I'd be surprised if they truly mattered.
msg266776 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-31 20:50
Just because it isn't public doesn't mean no one calls it :).  But seriously, that's why I said "the risk".  Is the small benefit of the codechange worth the small risk of breaking something?  I'm not answering that question, I'm posing it.  Call me -0 on the change.

I agree that the optimization argument doesn't really seem relevant.  If someone cares about performance they aren't going to be using pathlib.
msg266777 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-31 20:53
To clarify that performance statement: if someone is optimizing a tight loop, the first thing they'll likely do is drop down to the lower level calls to avoid all of the pathlib overhead.
msg407160 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-11-27 18:25
I agree that it's not worth changing the code for this. If someone tries to pass {"a": "b"} as a name then, yeah, they get a weird error message.
History
Date User Action Args
2022-04-11 14:58:31adminsetgithub: 71348
2021-11-27 19:07:20Antony.Leesetnosy: - Antony.Lee
2021-11-27 18:25:09iritkatrielsetstatus: open -> closed

nosy: + iritkatriel
messages: + msg407160

resolution: rejected
stage: resolved
2016-05-31 20:53:13r.david.murraysetmessages: + msg266777
2016-05-31 20:50:33r.david.murraysetmessages: + msg266776
2016-05-31 18:46:50Antony.Leesetmessages: + msg266767
2016-05-31 18:34:37serhiy.storchakasetnosy: + serhiy.storchaka, pitrou
messages: + msg266766
2016-05-31 18:27:49Antony.Leesetmessages: + msg266765
2016-05-31 15:07:19r.david.murraysetmessages: + msg266754
2016-05-31 06:35:50Antony.Leesetfiles: + pathlib-splitroot.patch
keywords: + patch
messages: + msg266738
2016-05-31 04:18:27r.david.murraysetmessages: + msg266733
2016-05-31 02:58:11ethan.furmansetnosy: + ethan.furman
messages: + msg266730
2016-05-31 00:50:18r.david.murraysetnosy: + r.david.murray
messages: + msg266727
2016-05-31 00:22:15Antony.Leecreate