classification
Title: option to not follow symlinks when globbing
Type: enhancement
Stage:
Components: Library (Lib)
Versions: Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: ethan.furman, o11c, wim.glenn
Priority: normal
Keywords: patch

Created on 2017-02-07 21:26 by o11c, last changed 2017-09-12 23:05 by wim.glenn.

Files
File name: python-glob-symlink.diff
Uploaded: o11c, 2017-02-07 21:26
Description: Patch to add symlinks= argument to glob.glob and glob.iglob
Messages (3)
msg287256 - Author: Ben Longbons (o11c) Date: 2017-02-07 21:26
Background:
I have a data hierarchy with a lot of "sibling" symlinked directories/files. I want to glob only the non-symlink files, because it's a *huge* performance increase.

Before `os.scandir`, I was using a local copy of `glob.py` and calling `os.path.islink` every time, which was faster for *my* use case, but unacceptable for upstreaming. With `os.scandir`, my new patch should be acceptable.

The patch includes tests.

Current discussion points:
* Am I making the right decision to still accept symlinks for fully-literal components (in glob0)? It doesn't apply to my use-case, and I imagine some people might want to handle that case separately.
* Are my tests sufficient? I just copied and modified the existing symlink tests.
* Should my `flags` TODO be implemented *before* this patch? IMO it would be clearer after, even if it makes the diffs longer.

Future discussion points (don't derail):
* Should my `flags` TODO be implemented internally (this would significantly shrink any future patches)? (I can work on this)
* Should `flags` also be exposed externally?
* What additional `flags` might be useful? (my list: GLOB_ERR, GLOB_MARK, ~GLOB_NOSORT, ~GLOB_NOESCAPE, GLOB_PERIOD, GLOB_BRACE, GLOB_TILDE_CHECK, GLOB_ONLYDIR (+ equivalent for files - also, why doesn't `os.scandir` have accessors for the other types without doing an unnecessary stat?))
* Is there a bitwise enum (or equivalently, enum set) in the standard library so `flags` can get sane reprs? (I've implemented this before, but imagine it would be overwhelmed with bikeshedding if it doesn't exist yet)
* Should `pathlib` really be implementing globbing on its own? That makes it hard to ensure feature parity. Perhaps the `glob` module needs some additional APIs? (I don't want to work on `pathlib` itself)
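
For illustration only (this is not taken from the attached patch): a minimal sketch of why `os.scandir` makes skipping symlinks cheap. The helper name `glob_no_symlinks` is hypothetical, and it handles a single directory level with `fnmatch` rather than the full `glob` logic.

```python
import fnmatch
import os


def glob_no_symlinks(dirname, pattern):
    """Yield names in `dirname` matching `pattern`, skipping symlinks.

    Hypothetical helper, not part of the glob module. Unlike glob.glob,
    fnmatch does not treat leading dots specially.
    """
    with os.scandir(dirname) as entries:
        for entry in entries:
            # DirEntry.is_symlink() normally uses information cached by
            # the directory scan, so no extra os.path.islink call is needed.
            if entry.is_symlink():
                continue
            if fnmatch.fnmatch(entry.name, pattern):
                yield entry.name
```

Something like `sorted(glob_no_symlinks(".", "*.txt"))` would then list only the non-symlink matches in the current directory.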
msg287279 - Author: Ethan Furman (ethan.furman) (Python committer) Date: 2017-02-08 05:25
Before talking about the patch, have you signed the Contributor License Agreement yet? The issue tracker isn't showing that you have. Check out https://www.python.org/psf/contrib/contrib-form/ to do so.

---

The `symlinks` flag: can you give some glob examples showing when, and when not, symlinks will be matched when the symlinks param is False?

A bit flags enum: There is now an `IntFlag` type in the `enum` module (new in Python 3.6) which implements bit flags.
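
As a rough sketch of how `IntFlag` could back a `flags` argument: the class name `GlobFlag` and its members below are assumptions, loosely modeled on the GLOB_* names listed in the first message. Nothing like this exists in the `glob` module.

```python
import enum


class GlobFlag(enum.IntFlag):
    """Hypothetical flag set; names and values are illustrative only."""
    ERR = 1
    MARK = 2
    ONLYDIR = 4


# Members combine with bitwise operators and stay GlobFlag instances,
# so they keep a readable repr instead of collapsing to a bare int.
flags = GlobFlag.ERR | GlobFlag.ONLYDIR

if GlobFlag.ONLYDIR in flags:
    print("directories only:", repr(flags))
```

Because `IntFlag` members are also ints, they stay compatible with code that expects plain integer bit masks.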
msg302002 - Author: wim glenn (wim.glenn) Date: 2017-09-12 23:05
+1, I would like to use this feature too, and I would also like it in `pathlib.PosixPath.glob`.
History
Date User Action Args
2017-09-12 23:05:45 wim.glenn set nosy: + wim.glenn; messages: + msg302002
2017-02-08 05:25:32 ethan.furman set nosy: + ethan.furman; messages: + msg287279
2017-02-07 21:26:15 o11c create