classification
Title: Better path handling with argparse
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: ascola, eric.smith, miss-islington, paul.j3, rhettinger, xmorel
Priority: normal Keywords: patch

Created on 2020-12-04 22:42 by ascola, last changed 2020-12-20 18:51 by rhettinger. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 23849 merged rhettinger, 2020-12-18 23:13
PR 23869 merged miss-islington, 2020-12-20 18:15
Messages (16)
msg382542 - (view) Author: Austin Scola (ascola) Date: 2020-12-04 22:42
One of the types of arguments that I find myself most often passing to `argparse.ArgumentParser` is paths. I think that I am probably not alone in frequent usage of paths as arguments. Given this, it would be extremely helpful to have an `argparse.Action` in `argparse` or a type similar to `FileType` that converted the string to a path. A path type factory could also take arguments to optionally check whether the path exists, is a directory, or satisfies other similar predicates.
msg382599 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-12-06 16:19
I think this would be a type, not an action.

I'm not sure this would pass the bar of something that should be added to the stdlib. But in any event, it should be developed on PyPI first, perhaps by adding it to argparse-types.
msg382692 - (view) Author: Austin Scola (ascola) Date: 2020-12-07 22:41
Hey Eric,

Thanks for the response. I'm unfamiliar with the process of adding features to the language. Would you mind explaining to me what some of the qualifications are for getting something added to the stdlib? And also what role packages on PyPI play in that process?

I am also willing to help implement this (and interested in doing so), but I don't want to step on anyone's toes, so if the offer of help is rejected I will not be upset.

Thanks!
msg382696 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-12-07 23:15
Hi, Austin.

If it's something that can be implemented in a library (which this suggestion qualifies as), then we typically want to see it on PyPI and to gain some traction there. I only suggested argparse-types because it also has some argparse add-ons. But you could certainly make a new package on PyPI.

But even then I'm not sure this would make it into the stdlib. For add-on functionality we're usually happy to leave things on PyPI. On the other hand, paths are somewhat more fundamental than some of the things in argparse-types. You might want to bring it up on the python-ideas mailing list and see how much support you get there.
msg382704 - (view) Author: Austin Scola (ascola) Date: 2020-12-08 01:32
Awesome, thank you for the guidance Eric. I'll start a thread on the python-ideas mailing list, gauge the level of support, and go from there.
msg382707 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2020-12-08 02:31
What exactly do you do with a path argument?

Admittedly I'm no expert with the os and os.path modules, but isn't a path just a string passed to a function such as os.chdir(), or joined to another string to create a filename?

I don't know what a 'path' type or action would do.
msg382716 - (view) Author: Xavier Morel (xmorel) * Date: 2020-12-08 08:08
> What exactly do you do with a path argument?

Because they mention "convert[ing] the string to a path", I would expect an output of `pathlib.Path`, optionally checked for existence / non-existence and / or kind (file, directory, symlink, ...).

Obviously it is rather easy to bring your own, but OP's expectation is that path inputs are common enough (especially, I would expect, in smaller standalone scripts) that having a helper in the standard library would be... helpful.
msg382758 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2020-12-08 18:12
The pathlib.Path is new since I paid much attention to os matters (I cut my teeth on py2.5).

Off hand it looks like the user could

    import argparse
    import pathlib

    parser = argparse.ArgumentParser()
    parser.add_argument('-p', type=pathlib.Path)

to convert a string into a Path object.

A custom type function could call this, and apply any desired methods before returning the Path object. But that should be up to the user, not the `argparse` developers.
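
As a rough sketch of that idea: the converter below (a hypothetical `resolved_path`, not anything provided by `argparse`) calls `pathlib.Path` and applies a couple of its methods before returning the result. It deliberately makes no existence check.

```python
import argparse
import pathlib


def resolved_path(value: str) -> pathlib.Path:
    """Hypothetical converter: expand '~' and make the path absolute."""
    # resolve() is non-strict by default, so the path does not have to exist.
    return pathlib.Path(value).expanduser().resolve()


parser = argparse.ArgumentParser()
parser.add_argument('-p', type=pathlib.Path)   # plain string-to-Path conversion
parser.add_argument('-r', type=resolved_path)  # conversion plus normalization

# An explicit argument list is used here so the sketch runs without a command line.
args = parser.parse_args(['-p', 'data/in.txt', '-r', 'out'])
print(type(args.p).__name__, args.r.is_absolute())
```

Any callable that takes a single string works as a *type*, so this stays entirely in user code.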

Importing path-specific modules such as pathlib (which in turn has a lot of imports) goes against the effort to reduce the number of unnecessary imports in modules like argparse.

https://bugs.python.org/issue30152
Reduce the number of imports for argparse

After this diet argparse still imports os, and uses:

    os.path.basename

I might also note that argparse has only one custom 'type' function, the FileType factory.  Other common type functions like 'int' and 'float' are Python builtins.  Adding a custom 'bool' (to be used instead of the builtin 'bool') has been rejected.

https://bugs.python.org/issue37564

json, yaml, and datetime types have also been rejected
https://bugs.python.org/issue35005
msg382760 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-12-08 19:35
The more I think about this, the more I think it shouldn't be in the stdlib. paul.j3 is correct that the simple case is just type=pathlib.Path.

For something more adventurous you could start with:

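from dataclasses import dataclass
from pathlib import Path

@dataclass(eq=True, frozen=True)
class ArgumentPath:
    must_exist: bool = False
    # Add other conditions as needed.

    def __call__(self, val):
        # Convert the command-line string, then apply the optional check.
        result = Path(val)
        if self.must_exist:
            if not result.exists():
                # argparse reports a ValueError from a type callable as a usage error.
                raise ValueError(f"path {result} must exist")
        return result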

The reason I think this shouldn't be in the stdlib is that there are race conditions here between when you inspect the filesystem and when you'd actually use the path. What if the file was deleted, or it went from being a directory to a file?

I think the best advice is to use type=pathlib.Path, and handle anything else when you try to cd, or open, or whatever it is you're doing with the path.
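
A minimal sketch of that advice, assuming the program simply wants to read the file: the path is converted with `type=pathlib.Path` and any filesystem problem is reported at the moment of use. The helper name `read_text_or_exit` and the file name are made up for illustration.

```python
import argparse
import pathlib
import sys


def read_text_or_exit(path: pathlib.Path) -> str:
    """Hypothetical helper: use the path and report failures at the point of use."""
    try:
        return path.read_text()
    except OSError as exc:
        # OSError covers FileNotFoundError, IsADirectoryError, PermissionError, etc.
        sys.exit(f"cannot read {path}: {exc.strerror}")


parser = argparse.ArgumentParser()
parser.add_argument('path', type=pathlib.Path)

# An explicit argument list is used so the sketch runs without a command line.
args = parser.parse_args(['notes.txt'])
# read_text_or_exit(args.path) would then either return the contents or exit with a message.
```

Because the check and the use are the same operation, there is no window in which the file can change between them.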

It probably wouldn't hurt to document type=pathlib.Path in the argparse docs.
msg382765 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-08 20:40
Agreed that we should not include new functionality here and instead just update the docs to show what can currently be done.

I'll write up a documentation patch for this.
msg382766 - (view) Author: Austin Scola (ascola) Date: 2020-12-08 21:35
I think that type=pathlib.Path is probably all that I was looking for here. I was unaware types could be passed easily like that and so updated documentation would definitely be helpful. The predicates such as existence would just have been nice-to-haves, but not necessary (and eric.smith brings up a good point about race conditions).
msg382772 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2020-12-08 23:01
One caution - the type parameter is a callable (function) that takes one string as argument.  I proposed `pathlib.Path` because it does that, returning a Path object.  It's a class constructor.  I believe the module has other class constructors.

bool() has confused many users.  While it returns a bool class instance, True or False, the only string that returns False is the empty one, which argparse can't supply. It does not convert strings like 'False' or 'no' to boolean False.
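
A short illustration of the pitfall, with option names chosen only for the example. The `store_true` action is shown as the usual alternative for an on/off switch.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--flag', type=bool)               # pitfall: any non-empty string is truthy
parser.add_argument('--verbose', action='store_true')  # usual alternative for on/off switches

args = parser.parse_args(['--flag', 'False', '--verbose'])
print(args.flag)     # True, because bool('False') is True
print(args.verbose)  # True, because the switch was given
```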
msg383331 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-18 23:20
Eric and Paul, I've attached a substantial rewrite of the docs for the *type* parameter:

* Document the exceptions that are handled.
* Show a wider range of examples that work with *type*.
* Discuss when *type* shouldn't be used:  bool, JSONDecoder, etc.
* Create a less whimsical example of a user-defined type converter.
* Explain the interaction between *type* and *default*.

Let me know what you think.
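
A small sketch touching two of the points above: exceptions raised by a converter, and the interaction between *type* and *default*. The converter `positive_int` and the option names are hypothetical and are not taken from the documentation patch itself.

```python
import argparse
import pathlib


def positive_int(value: str) -> int:
    """Hypothetical converter used only for illustration."""
    number = int(value)  # a ValueError raised here is reported by argparse as a usage error
    if number <= 0:
        # ArgumentTypeError lets the converter supply its own error message.
        raise argparse.ArgumentTypeError(f"{value!r} is not a positive integer")
    return number


parser = argparse.ArgumentParser()
# String defaults are passed through the type callable, just like command-line strings.
parser.add_argument('--count', type=positive_int, default='3')
parser.add_argument('--out', type=pathlib.Path, default='build')

args = parser.parse_args([])
print(args.count, repr(args.out))
```

With these definitions, `parser.parse_args(['--count', '0'])` prints a usage message and exits rather than propagating the exception.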
msg383372 - (view) Author: Austin Scola (ascola) Date: 2020-12-19 11:40
Thank you Raymond! I learned a few things by reading the proposed documentation updates.
msg383434 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-20 18:15
New changeset b0398a4b7fb5743f6dbb72ac6b2926e0a0c11498 by Raymond Hettinger in branch 'master':
bpo-42572:  Improve argparse docs for the type parameter. (GH-23849)
https://github.com/python/cpython/commit/b0398a4b7fb5743f6dbb72ac6b2926e0a0c11498
msg383439 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-20 18:51
New changeset 40b4c405f98f2d35835ef5d183f0327c0c55da6f by Miss Islington (bot) in branch '3.9':
bpo-42572:  Improve argparse docs for the type parameter. (GH-23849) (GH-23869)
https://github.com/python/cpython/commit/40b4c405f98f2d35835ef5d183f0327c0c55da6f
History
Date User Action Args
2020-12-20 18:51:41 rhettinger set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-12-20 18:51:27 rhettinger set messages: + msg383439
2020-12-20 18:15:14 miss-islington set nosy: + miss-islington; pull_requests: + pull_request22731
2020-12-20 18:15:02 rhettinger set messages: + msg383434
2020-12-19 11:40:13 ascola set messages: + msg383372
2020-12-18 23:20:18 rhettinger set messages: + msg383331
2020-12-18 23:13:39 rhettinger set keywords: + patch; stage: patch review; pull_requests: + pull_request22711
2020-12-08 23:01:42 paul.j3 set messages: + msg382772
2020-12-08 21:35:56 ascola set messages: + msg382766
2020-12-08 20:40:41 rhettinger set assignee: rhettinger; messages: + msg382765; components: + Documentation, - Library (Lib); versions: + Python 3.9, Python 3.10
2020-12-08 19:35:29 eric.smith set messages: + msg382760
2020-12-08 18:12:30 paul.j3 set messages: + msg382758
2020-12-08 08:08:34 xmorel set nosy: + xmorel; messages: + msg382716
2020-12-08 02:31:41 paul.j3 set messages: + msg382707
2020-12-08 01:32:35 ascola set messages: + msg382704
2020-12-07 23:15:28 eric.smith set messages: + msg382696
2020-12-07 22:41:02 ascola set messages: + msg382692
2020-12-06 16:19:09 eric.smith set nosy: + eric.smith; messages: + msg382599
2020-12-05 02:49:21 xtreak set nosy: + rhettinger, paul.j3
2020-12-04 22:42:15 ascola create