
classification
Title: '*' glob string matches dot files in pathlib
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.6

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: gvanrossum, jitterman, pitrou
Priority: normal
Keywords:

Created on 2016-01-13 02:17 by jitterman, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (7)
msg258128 - Author: JitterMan (jitterman) Date: 2016-01-13 02:17
Path('.').glob('*') generates all files and directories in '.' including hidden files (those that begin with '.'). This behavior is inconsistent with the shell and with the old glob module, which only generate hidden files if the glob pattern starts with a '.'.
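A minimal sketch of the reported difference, assuming a throwaway directory created with tempfile. The helper name compare_globs and the file names are illustrative only and do not come from the report.

```python
import glob
import os
import tempfile
from pathlib import Path


def compare_globs(directory):
    """Return (pathlib names, glob-module names) for the pattern '*'."""
    pathlib_names = sorted(p.name for p in Path(directory).glob('*'))
    # glob.escape guards against magic characters in the directory name itself.
    module_pattern = os.path.join(glob.escape(directory), '*')
    glob_names = sorted(os.path.basename(p) for p in glob.glob(module_pattern))
    return pathlib_names, glob_names


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, 'visible.txt').touch()
    Path(tmp, '.hidden').touch()
    pathlib_names, glob_names = compare_globs(tmp)
    print(pathlib_names)  # includes '.hidden'
    print(glob_names)     # omits '.hidden'
```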
msg258209 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2016-01-14 18:23
I'm frankly not sure that's a bug. If you want to filter out dotfiles, it is quite easy to do yourself. On the other hand, if pathlib always filtered them out, it would be more cumbersome to get a unified set of results with both dotfiles and non-dotfiles.
msg258210 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-01-14 18:32
I've thought about this too, and I've decided it's a feature. As Antoine says, it's easy enough to filter the dot files out if you don't want them; it's slightly complicated to include them if the default is to skip them.
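For the single-level case, the filtering mentioned above might look like the following sketch. The function name visible_entries is hypothetical, and the check only looks at the final path component.

```python
import tempfile
from pathlib import Path


def visible_entries(directory, pattern='*'):
    """Return glob matches whose final component does not start with '.'."""
    return sorted(
        p for p in Path(directory).glob(pattern)
        if not p.name.startswith('.')
    )


with tempfile.TemporaryDirectory() as tmp:
    for name in ('README', '.gitignore', '.cache'):
        Path(tmp, name).touch()
    visible_names = [p.name for p in visible_entries(tmp)]
    all_names = sorted(p.name for p in Path(tmp).glob('*'))
    print(visible_names)
    print(all_names)
```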
msg258221 - Author: JitterMan (jitterman) Date: 2016-01-14 21:16
The same issue applies to '**', where it is harder to work around. The shell version of '**' prunes hidden files and directories at all levels, whereas pathlib.glob() includes all hidden directories.

Major surgery would be required to filter out hidden files and directories from the output of pathlib.glob(), which would be expensive and complicated. Perhaps an option is called for that would allow pathlib.glob() to prune its search to only non-hidden directories if requested. And if that seems like a good idea, perhaps there could also be options that specify that only files or only directories should be returned.
msg258246 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-01-14 22:50
If you have that use case you're probably better off using os.walk() so you are not limited to a prune strategy that can be expressed using a single glob pattern (e.g. maybe I want to ignore .git, .hg and __pycache__ but descend into everything else).

I think we would also welcome adding a walk() method to pathlib. But let's please leave glob() alone.
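The os.walk() approach suggested above can be sketched as follows. The names walk_files and PRUNED_DIRS are hypothetical; the pruning relies on the documented behavior that editing dirnames in place during a top-down walk stops os.walk() from descending into the removed directories.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical prune list, taken from the example in the message above.
PRUNED_DIRS = {'.git', '.hg', '__pycache__'}


def walk_files(root, pruned=PRUNED_DIRS):
    """Yield files under root without descending into pruned directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Slice assignment mutates the list os.walk is iterating over.
        dirnames[:] = [d for d in dirnames if d not in pruned]
        for name in filenames:
            yield Path(dirpath, name)


with tempfile.TemporaryDirectory() as tmp:
    for rel in ('src/main.py', '.git/config',
                'src/__pycache__/main.pyc', '.config/settings'):
        target = Path(tmp, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    found = sorted(p.relative_to(tmp).as_posix() for p in walk_files(tmp))
    print(found)
```

Unlike a blanket dotfile rule, this keeps .config while skipping .git and __pycache__, which is the kind of selective pruning a single glob pattern cannot express.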
msg259539 - Author: JitterMan (jitterman) Date: 2016-02-04 07:34
Globbing has been with us for almost 50 years, and in all that time it has never matched the hidden files/directories. There may be isolated cases where matching the hidden items is preferred, but generally that is not the case. Indeed, the primary characteristic of being hidden is that it should not be included in globbing. One marks a file or directory to be hidden specifically to mean 'do not include this one when selecting groups of files or directories'.

Once the glob string has been expanded, it is possible to filter out the hidden files and directories, but it is very difficult to do so if there are several levels of directories, because one has to look for hidden items at all levels of the path.

Globbing has been available and largely unchanged for almost 50 years. I am not the one that is asking for it to be changed. I am asking for it to be returned to what it has always been. Being consistent with bash and other shells is a very important feature. It allows us to offer pathlib globbing to the end user and have it work the way they expect.
msg259816 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-02-08 04:43
If you really need an easy way to provide what the shell offers to end users, maybe you could submit a patch that adds an option similar to "GLOBIGNORE" in bash (but the default would be to return everything, which is more regular and more useful for Python programs). Though you might also consider how easy it would be to write a wrapper function around Path.glob() that implements this.
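One possible shape for such a wrapper is sketched below. It is only loosely modeled on the GLOBIGNORE idea and does not reproduce bash semantics; the function name glob_ignoring and its ignore parameter are hypothetical, not an existing pathlib option. The default returns everything, as suggested above. Note that it filters after expansion, so hidden directories are still traversed.

```python
import tempfile
from fnmatch import fnmatchcase
from pathlib import Path


def glob_ignoring(root, pattern, ignore=()):
    """Yield Path(root).glob(pattern) matches, skipping any match that has
    a component (below root) matching one of the `ignore` patterns."""
    root = Path(root)
    for path in root.glob(pattern):
        parts = path.relative_to(root).parts
        if not any(fnmatchcase(part, pat) for part in parts for pat in ignore):
            yield path


with tempfile.TemporaryDirectory() as tmp:
    for rel in ('notes.txt', '.env', 'pkg/mod.py', 'pkg/.cache/x'):
        target = Path(tmp, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    everything = sorted(
        p.relative_to(tmp).as_posix() for p in glob_ignoring(tmp, '**/*'))
    shell_like = sorted(
        p.relative_to(tmp).as_posix()
        for p in glob_ignoring(tmp, '**/*', ignore=('.*',)))
    print(everything)
    print(shell_like)
```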
History
Date                 User         Action  Args
2022-04-11 14:58:26  admin        set     github: 70284
2016-02-08 04:43:36  gvanrossum   set     messages: + msg259816
2016-02-04 07:34:54  jitterman    set     messages: + msg259539
2016-01-14 22:50:40  gvanrossum   set     messages: + msg258246
2016-01-14 21:16:05  jitterman    set     messages: + msg258221
2016-01-14 18:32:44  gvanrossum   set     status: open -> closed; resolution: not a bug; messages: + msg258210
2016-01-14 18:23:53  pitrou       set     messages: + msg258209
2016-01-14 18:19:07  SilentGhost  set     nosy: + gvanrossum, pitrou; components: + Library (Lib); versions: + Python 3.6, - Python 3.4
2016-01-13 02:17:53  jitterman    create