Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better stdlib support for Path objects using .path attribute #66760

Closed
warsaw opened this issue Oct 6, 2014 · 44 comments
Closed

Better stdlib support for Path objects using .path attribute #66760

warsaw opened this issue Oct 6, 2014 · 44 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@warsaw
Copy link
Member

warsaw commented Oct 6, 2014

BPO 22570
Nosy @gvanrossum, @warsaw, @brettcannon, @birkenfeld, @pitrou, @ezio-melotti, @cool-RR, @ethanfurman, @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2016-05-19.20:18:47.663>
created_at = <Date 2014-10-06.15:33:51.413>
labels = ['type-feature', 'library']
title = 'Better stdlib support for Path objects using .path attribute'
updated_at = <Date 2016-06-02.17:13:36.987>
user = 'https://github.com/warsaw'

bugs.python.org fields:

activity = <Date 2016-06-02.17:13:36.987>
actor = 'ethan.furman'
assignee = 'none'
closed = True
closed_date = <Date 2016-05-19.20:18:47.663>
closer = 'gvanrossum'
components = ['Library (Lib)']
creation = <Date 2014-10-06.15:33:51.413>
creator = 'barry'
dependencies = []
files = []
hgrepos = []
issue_num = 22570
keywords = []
message_count = 44.0
messages = ['228704', '228707', '228713', '228715', '228716', '228717', '228718', '228767', '229544', '229546', '229547', '229548', '229549', '229550', '229552', '253124', '257570', '257593', '257597', '257599', '257615', '257616', '257621', '257622', '257624', '257630', '257632', '257633', '257634', '257635', '257638', '257643', '257647', '257913', '257920', '257924', '257926', '257928', '262935', '262938', '265414', '265843', '265844', '265892']
nosy_count = 14.0
nosy_names = ['gvanrossum', 'barry', 'brett.cannon', 'georg.brandl', 'pitrou', 'bronger', 'ezio.melotti', 'Arfrever', 'daniel.ugra', 'cool-RR', 'ethan.furman', 'tshepang', 'python-dev', 'serhiy.storchaka']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue22570'
versions = ['Python 3.4', 'Python 3.5', 'Python 3.6']

@warsaw
Copy link
Member Author

warsaw commented Oct 6, 2014

pathlib is really nice, but currently it's rather inconvenient to use due to the lack of support in other parts of the stdlib for Path objects. For historical reasons, everything accepts string paths, but few places accept Paths. As an example: configparser.ConfigParser.read() but there are lots of others.

I'm opening this bug to start a conversation about better support for Path objects in the stdlib. Against all hope, I wish there was a simple way to extend the compatibility, but I don't like having to sprinkle str(some_path) calls everywhere (kind of defeats the purpose of having the nicer pathlib API IMHO). I suspect instead that it will be a matter of adding type tests or str() conversions to the relevant methods, but there may be other issues to discuss, like is it even a good idea to do this? ;)

@warsaw warsaw added the stdlib Python modules in the Lib dir label Oct 6, 2014
@birkenfeld
Copy link
Member

Since we're unlikely to ever change all the places, I'd say it's better to be consistent. I'd rather write str(path) all over the place than having to look up in the docs each time if that specific API happens to support passing Paths directly.

However, Antoine has been positive towards more utility methods on Path objects. (Not that ConfigParser read() would fall in that category :)

@warsaw
Copy link
Member Author

warsaw commented Oct 6, 2014

On Oct 06, 2014, at 03:43 PM, Georg Brandl wrote:

I'd rather write str(path) all over the place than having to look up in the
docs each time if that specific API happens to support passing Paths
directly.

Have you tried to write a large-ish application using path objects?
str-infection is definitely a disincentive to using pathlib. :/

@birkenfeld
Copy link
Member

I was about to suggest deriving your own Path class from Path and str, but got a base class layout conflict because Path objects define lots of __slots__ :(

@pitrou
Copy link
Member

pitrou commented Oct 6, 2014

I was about to suggest deriving your own Path class from Path and str

That would be a rather horrible solution.
It has already been suggested to create a "path protocol". I would suggest Barry tries to float and investigate the idea on python-ideas:
https://mail.python.org/pipermail/python-ideas/2014-May/027869.html

@birkenfeld
Copy link
Member

That would be a rather horrible solution.

I know :)

@pitrou
Copy link
Member

pitrou commented Oct 6, 2014

As for adding convenience methods to Path objects, yes, I'm quite open to that. Best is to open an issue when you have a specific idea (and a patch :-)).

@brettcannon
Copy link
Member

I think I'm missing something here because the idea of doing path = str(path) at the API boundary for an old function to support both Path and str objects for paths seems fairly minimal. Only when manipulating a path is wanting a Path object going to come up, and in that case can't you just do path = pathlib.Path(path) instead?

@ezio-melotti ezio-melotti added the type-feature A feature request or enhancement label Oct 7, 2014
@cool-RR
Copy link
Mannequin

cool-RR mannequin commented Oct 16, 2014

I agree with Brett. Making old functions support Path sounds trivial to me, and I don't think there are any backward compatibility issues. Having to do str(path) every time you call a function is ugly. So I think that all stdlib functions that currently take a string path should be able to take a Path object.

I'll add to that the functions in the zipfile module.

@birkenfeld
Copy link
Member

path = str(path) is minimal, but has the side effect of accepting almost any object, which is definitely not what you'd like ("where did that file named '<type object at ...>' come from?!")

@cool-RR
Copy link
Mannequin

cool-RR mannequin commented Oct 16, 2014

Fine, so you add an if isinstance... in front of it and you're done.

@birkenfeld
Copy link
Member

Making old functions support Path sounds trivial to me

We're looking forward to trivial patches that enable Path handling consistently throughout the stdlib.

Fine, so you add an if isinstance... in front of it and you're done.

Easier said than done in C modules.

@cool-RR
Copy link
Mannequin

cool-RR mannequin commented Oct 16, 2014

Georg: You're right, I forgot about C modules which make this not as trivial as I thought.

As for Python modules: If there's general agreement that this is a feature we want, I'll be happy to make a patch for the zipfile module to accept Path arguments.

@bitdancer
Copy link
Member

Well, if you use an isinstance check you privilege the stdlib Path over any other pathlike implementation. Since it *is* in the stdlib, this isn't an automatic reason for rejection, but it does have a bit of a code smell to it. Why should everything that deals with path strings have to have intimate knowledge of Path objects?

I originally wrote here "Maybe we need a __path__ magic method" as a half-joke, but rereading the issue I see that this has in fact been proposed seriously, and referenced by Antoine (the pathlib author).

I'm -1 on just sprinkling support for Path throughout the stdlib. Do it in a universally applicable fashion or don't do it at all, IMO.

@pitrou
Copy link
Member

pitrou commented Oct 16, 2014

I'm -1 on just sprinkling support for Path throughout the stdlib.
Do it in a universally applicable fashion or don't do it at all, IMO.

How about doing it per module?

@bronger
Copy link
Mannequin

bronger mannequin commented Oct 17, 2015

Please be conservative with adding methods to Path.

FWIW, my own experiences with pathlib have shown that utility methods have the disadvantage of duplicating well-established Python idioms. This reduces code readability in my opinion. While it is desirable that the rather inconvenient os.path may be superseded by pathlib in the long run, utility methods like Path.glob, Path.open, Path.mkdir, and Path.readtext have convenient stdlib counterparts with long tradition, and using the methods seems disruptive and confusing to me.

This is a further argument for having a path protocol instead IMO.

@gvanrossum
Copy link
Member

Random idea: what if pathlib.Path defined a .path attribute that was a plain string? Then you could write p.path instead of str(p), and "if hasattr(p, 'path'): p = p.path". This would be the new protocol. Advantage is also that DirEntry (returned by the new os.scandir()) already supports it. :-)

@cool-RR
Copy link
Mannequin

cool-RR mannequin commented Jan 6, 2016

I thought about it some more, and personally I'd prefer each function to do str(path) internally rather than if hasattr(p, 'path'): p = p.path. Even if it means we'll have to deal with "where did that file named '<type object at ...>' come from?!" errors. I think it's clumsy that the path protocol is to access path.path. We already have a protocol for getting a string out of an object and it's str(object). I think we should use it even if it makes debugging harder in the case where someone mistakenly passes a non-path object to a function that wants a path. (And without using isinstance either.)

@serhiy-storchaka
Copy link
Member

str(object) is not a protocol for getting a string out of an object. It's a protocol for getting a string for print(). str is defined for every object and therefore is useless for getting a string out of "string-like" object (as float for floats and bytes for bytes). Perhaps we need a new special method string that relates to str as index to int.

@pitrou
Copy link
Member

pitrou commented Jan 6, 2016

Here the aim is really to distinguish path-like objects from other objects, not to accept arbitrary strings.

A ".path" attribute sounds like the wrong name, though: a Path has a path? And the Path's path is not a Path? Ouch :-)

@gvanrossum
Copy link
Member

I think it's actually very reasonable for a Path to have a path attribute
that's a string. The DirEntry has two string attributes: name (the last
component) and path (the full path). The Path object already has the
former. Adding the latter makes sense to me. After all you've gotta give it
*some* name, and 'path' is used (unsurprisingly) in this meaning already in
many places.

The shortest idiom in libraries wanting to support this would be

    path = gettattr(arg, 'path', arg)

This extracts the path attribute from a DirEntry or Path object, and
assumes the argument is a string otherwise. I think this is relatively
reasonable to encode in C as well.

@brettcannon
Copy link
Member

Would location work as an attribute name?

@gvanrossum
Copy link
Member

Only if we changed DirEntry to support that too. But it's a kind of
high-falootin' word that also has some other connotations (e.g.
geographical location, and the HTTP Location header). I've never heard it
use in relation to filenames -- those are invariably called some variant of
file, file name, path, full path. Really, the argument Antoine brings up
doesn't hold much weight.

@brettcannon
Copy link
Member

Personally I thought the name path fit; just trying to see if some other option might work that Antoine would also like.

@pitrou
Copy link
Member

pitrou commented Jan 6, 2016

In any case I don't think "location" is any better ;-) If "path" fits other people then good.

@gvanrossum
Copy link
Member

OK, I'll add 'path' to unblock changes to the stdlib (but I won't close
this issue).

@python-dev
Copy link
Mannequin

python-dev mannequin commented Jan 6, 2016

New changeset 7e9605697dfc by Guido van Rossum in branch '3.4':
Issue bpo-22570: Add 'path' attribute to pathlib.Path objects.
https://hg.python.org/cpython/rev/7e9605697dfc

New changeset 9c49c417a68a by Guido van Rossum in branch '3.5':
Issue bpo-22570: Add 'path' attribute to pathlib.Path objects. (Merge 3.4->3.5)
https://hg.python.org/cpython/rev/9c49c417a68a

New changeset d5f96a5da219 by Guido van Rossum in branch 'default':
Issue bpo-22570: Add 'path' attribute to pathlib.Path objects. (Merge 3.5->3.6)
https://hg.python.org/cpython/rev/d5f96a5da219

@birkenfeld
Copy link
Member

No docs? ;)

@python-dev
Copy link
Mannequin

python-dev mannequin commented Jan 6, 2016

New changeset 2e3c31ab586a by Guido van Rossum in branch '3.4':
Docs for issue bpo-22570.
https://hg.python.org/cpython/rev/2e3c31ab586a

New changeset 408f8b255b56 by Guido van Rossum in branch '3.5':
Docs for issue bpo-22570. (Merge 3.4->3.5)
https://hg.python.org/cpython/rev/408f8b255b56

New changeset 759b2cecc289 by Guido van Rossum in branch '3.4':
Add versionadded (3.4.5) to docs for issue bpo-22570.
https://hg.python.org/cpython/rev/759b2cecc289

New changeset 1a6b485e717f by Guido van Rossum in branch '3.5':
Add versionadded (3.4.5) to docs for issue bpo-22570. (Merge 3.4->3.5)
https://hg.python.org/cpython/rev/1a6b485e717f

New changeset eab349b5c6d7 by Guido van Rossum in branch '3.5':
Cross-reference os.DirEntry and pathlib.Path for issue bpo-22570.
https://hg.python.org/cpython/rev/eab349b5c6d7

New changeset 97ab0ccac893 by Guido van Rossum in branch 'default':
Docs for issue bpo-22570. (Merge 3.5->3.6)
https://hg.python.org/cpython/rev/97ab0ccac893

@gvanrossum
Copy link
Member

So, I added docs, mentioning the getattr(arg, 'path', arg) idiom, and (for 3.5 and 3.6) also cross-referencing with DirEntry.

I'm not sure whether to now close this issue or whether to leave it open to remind people of adding patches using the new idiom to various stdlib modules. Opinions?

Also, since pathlib is provisional, I felt okay with adding this to 3.4.5 and 3.5.2.

@brettcannon
Copy link
Member

We could leave this open as a meta issue and spin off individual issues for any specific module people choose to update to support Path objects, setting those new issues as dependencies here.

@serhiy-storchaka
Copy link
Member

Opened bpo-26027 for adding support in the posix module.

Should the path attribute be only string, or other types are acceptable (bytes, file descriptor, and even None if the function accepts None)?

@gvanrossum
Copy link
Member

Let's say that the path attribute should be str or bytes -- this matches
the behavior of DirEntry. (But for pathlib.Path it is always a str.) It
cannot be None or FD. But note that the getattr(x, 'path', x) idiom returns
x unchanged if x is None or an FD (or a stream, for that matter).

@cool-RR
Copy link
Mannequin

cool-RR mannequin commented Jan 10, 2016

Here's an alternate idea I thought of now. Maybe path.path should be a pointer to the same Path object, so every function will do this:

str(arg.path) if hasattr(arg, 'path') else arg

What I like about this is that it avoids having the path argument be a string. I don't like misnomers :)

I'd understand though if people won't like this suggestion since it introduces another step, of turning the path back into a string.

@brettcannon
Copy link
Member

I appreciate the thought, Ram, but I'm not a fan.

@gvanrossum
Copy link
Member

Using 'path' for this field is not a misnomer (example: DictEntry uses 'path' for the same field).

See issue bpo-26027 for a follow-up idea.

@serhiy-storchaka
Copy link
Member

str(arg.path) if hasattr(arg, 'path') else arg

DirEntry.path can be bytes.

@gvanrossum
Copy link
Member

You can't win 'em all... :-)

@python-dev
Copy link
Mannequin

python-dev mannequin commented Apr 6, 2016

New changeset d0c8b2c1544e by Serhiy Storchaka in branch '3.5':
Issue bpo-22570: Renamed Py_SETREF to Py_XSETREF.
https://hg.python.org/cpython/rev/d0c8b2c1544e

New changeset 719c11b6b6ff by Serhiy Storchaka in branch 'default':
Issue bpo-22570: Renamed Py_SETREF to Py_XSETREF.
https://hg.python.org/cpython/rev/719c11b6b6ff

New changeset 7197809a7428 by Serhiy Storchaka in branch '2.7':
Issue bpo-22570: Renamed Py_SETREF to Py_XSETREF.
https://hg.python.org/cpython/rev/7197809a7428

@serhiy-storchaka
Copy link
Member

Sorry, these changesets were related to bpo-26200.

@gvanrossum
Copy link
Member

Reopening as we need to rename the path attribute to __fspath__ once Brett's PEP is accepted (which will be soon). https://github.com/brettcannon/path-pep/blob/master/pep-0NNN.rst

The 3.4 and 3.5 versions of this should probably just be reversed before their releases (early June).

@gvanrossum gvanrossum reopened this May 12, 2016
@gvanrossum
Copy link
Member

PEP-519 is accepted now. We need to revert the commits from http://bugs.python.org/issue22570#msg257634

@gvanrossum
Copy link
Member

And those from http://bugs.python.org/issue22570#msg257632 (these are the actual code -- the others were the docs).

@gvanrossum
Copy link
Member

Done. The revs are 90e58a77d386, 97198545e6c3, ade839421b8f.

@ethanfurman ethanfurman changed the title Better stdlib support for Path objects Better stdlib support for Path objects using .path attribute Jun 2, 2016
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

8 participants