classification
Title: pathlib.glob('**') returns only directories
Type: behavior
Stage: resolved
Components:
Versions: Python 3.5
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: SilentGhost, jitterman, pitrou
Priority: normal
Keywords:

Created on 2016-01-14 21:20 by jitterman, last changed 2016-02-04 07:25 by SilentGhost. This issue is now closed.

Messages (3)
msg258223 - Author: JitterMan (jitterman) Date: 2016-01-14 21:20
The title says it all.

The shell versions of '*' and '**' return both directories and files.
Path('.').glob('*') returns both directories and files, but Path('.').glob('**') returns only directories. That seems wrong to me.
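
The difference described above can be reproduced with a minimal sketch. The tree layout and the helper names `build_tree` and `names` are made up for illustration. The result for '**' is printed rather than asserted, because it may depend on the Python version in use.

```python
import tempfile
from pathlib import Path


def build_tree(root):
    """Create one top-level file and one subdirectory containing a file."""
    (root / "sub").mkdir()
    (root / "top.txt").write_text("top")
    (root / "sub" / "nested.txt").write_text("nested")


def names(root, pattern):
    """Return sorted POSIX-style paths, relative to root, matched by pattern."""
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_tree(root)
    star = names(root, "*")          # direct children: files and directories
    double_star = names(root, "**")  # the pattern this report is about
    print("*  ->", star)
    print("** ->", double_star)
```

On the version discussed in this report, the second list is expected to contain only directories (including '.' for the starting directory itself).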
msg258237 - Author: SilentGhost (SilentGhost) * (Python triager) Date: 2016-01-14 22:26
It is, however, exactly what documentation says it should do:

> The “**” pattern means “this directory and all subdirectories, recursively”.
msg259537 - Author: JitterMan (jitterman) Date: 2016-02-04 07:20
It may be what the documentation says it will do, but it is not what it should do. I believe so because:

1. Currently ** in pathlib matches only directories, but **.py matches files. That seems inconsistent.
2. In bash, and csh, ** matches files and directories. To get the same in pathlib one must use **/*, which is inconsistent with what we have used for many decades.
3. With the traditional meaning of **, it is easy to constrain the match to directories by adding slash to the end of the glob (just use **/).
4. There is considerable value in supporting the traditional meaning of glob strings. Globbing is a very powerful feature, and it is often offered to the end user in shell-like situations. For example, sftp offers globbing. When offering globbing to end users, it is best to be consistent with the globbing they are already familiar with.
5. There is no significant advantage to the difference between pathlib globbing and traditional globbing.
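
As a sketch of the '**/*' workaround mentioned in point 2, the following shows one way to get files and directories recursively, only directories, or only '.py' files. The function names and the sample tree are hypothetical. Directories are selected with an explicit is_dir() filter rather than a trailing slash, on the assumption that trailing-slash handling may differ between pathlib versions.

```python
import tempfile
from pathlib import Path


def relative_names(root, paths):
    """Return sorted POSIX-style paths relative to root."""
    return sorted(p.relative_to(root).as_posix() for p in paths)


def recursive_entries(root):
    """Files and directories at any depth below root (not root itself)."""
    return relative_names(root, root.glob("**/*"))


def recursive_dirs(root):
    """Only directories at any depth below root."""
    return relative_names(root, (p for p in root.glob("**/*") if p.is_dir()))


def recursive_py(root):
    """Only entries ending in .py, at any depth including root's own level."""
    return relative_names(root, root.glob("**/*.py"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg" / "data").mkdir(parents=True)
    (root / "setup.py").write_text("")
    (root / "pkg" / "mod.py").write_text("")
    (root / "pkg" / "data" / "notes.txt").write_text("")

    entries = recursive_entries(root)
    dirs = recursive_dirs(root)
    py_files = recursive_py(root)
    print(entries)
    print(dirs)
    print(py_files)
```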

Globbing in pathlib is different from traditional globbing in another important way. pathlib does not distinguish between hidden files and directories and normal files and directories. There may be isolated cases where that is preferred, but generally that is not true. Indeed, the primary characteristic of being hidden is that it should not be included in globbing. One marks a file or directory as hidden specifically to mean 'do not include this one when selecting groups of files or directories'. Once the glob string has been expanded, it is possible to filter out the hidden files and directories, but it is very difficult to do if there are several levels of directories, because when weeding out the matches that should not be included you have to look for hidden items at all levels of the path.
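
The post-filtering step described above could look roughly like this. `glob_visible` is a hypothetical helper, not part of pathlib, and it assumes the Unix convention that a name starting with '.' is hidden. It checks every component of each match relative to the starting directory, so entries inside hidden directories are dropped as well.

```python
import tempfile
from pathlib import Path


def glob_visible(root, pattern):
    """Yield matches of pattern under root, skipping hidden entries.

    A match is skipped if any component of its path relative to root
    starts with a dot, so contents of hidden directories are excluded too.
    """
    for path in root.glob(pattern):
        parts = path.relative_to(root).parts
        if not any(part.startswith(".") for part in parts):
            yield path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".git").mkdir()
    (root / "src" / ".cache").mkdir(parents=True)
    (root / "visible.txt").write_text("")
    (root / ".hidden.txt").write_text("")
    (root / ".git" / "config").write_text("")
    (root / "src" / "main.py").write_text("")
    (root / "src" / ".cache" / "tmp.bin").write_text("")

    visible = sorted(
        p.relative_to(root).as_posix() for p in glob_visible(root, "**/*")
    )
    print(visible)
```

The filter still walks hidden directories before discarding their contents, so it addresses correctness of the result but not the cost of the traversal.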

Globbing has been available and largely unchanged for almost 50 years. I encourage you to strongly consider making pathlib globbing more consistent with what we have all grown up with.
History
Date                 User         Action  Args
2016-02-04 07:25:04  SilentGhost  set     nosy: + pitrou
2016-02-04 07:20:25  jitterman    set     messages: + msg259537
2016-01-14 22:26:27  SilentGhost  set     status: open -> closed; nosy: + SilentGhost; messages: + msg258237; resolution: not a bug; stage: resolved
2016-01-14 21:20:32  jitterman    create