classification
Title: glob.glob does not sort its results
Type: Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Dave Jones, Jim.Jewett, drj, eric.smith, r.david.murray, serhiy.storchaka
Priority: normal Keywords:

Created on 2014-06-13 13:31 by drj, last changed 2015-11-13 11:19 by drj. This issue is now closed.

Messages (13)
msg220441 - Author: David Jones (drj) * Date: 2014-06-13 13:31
```
import glob
for f in glob.glob('input/*/*.dat'): print f
```

outputs:

```
input/ghcnm.v3.2.2.20140611/ghcnm.tavg.v3.2.2.20140611.qca.dat
input/ghcnm.v3.2.2.20140506/ghcnm.tavg.v3.2.2.20140506.qca.dat
```

Note that these are not in the right order. Compare with the shell, which always sorts its globs:

```
drj$ printf '%s\n' input/*/*.dat
input/ghcnm.v3.2.2.20140506/ghcnm.tavg.v3.2.2.20140506.qca.dat
input/ghcnm.v3.2.2.20140611/ghcnm.tavg.v3.2.2.20140611.qca.dat
```

I think the shell behaviour is better and we should be allowed to rely on glob.glob sorting its result.

Note from the documentation: "The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell". The Unix shell has always sorted its globs.
msg220443 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-06-13 14:23
I think there is no reason to impose the overhead of a sort unless the user wants it...in which case they can sort it.  I'm -1 on this change.
msg220468 - Author: Jim Jewett (Jim.Jewett) (Python triager) Date: 2014-06-13 16:59
I agree with R. David Murray, but it may be worth adding a clarification sentence (or an example with sorted) to the documentation.  

Changing status to Pending, in hopes that any doc changes would be quick.
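
For illustration, the kind of "example with sorted" suggested above might look like the following minimal sketch. It assumes Python 3 (the report itself is against 2.7), and the helper name `sorted_glob` and the throwaway directory are hypothetical. Note that `sorted()` orders strings by code point, which is not necessarily the locale-aware collation a shell applies.

```python
import glob
import os
import tempfile


def sorted_glob(pattern):
    """Return the matches of glob.glob() in sorted (code point) order."""
    return sorted(glob.glob(pattern))


# Demonstrate on a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as root:
    for name in ("b.dat", "a.dat", "c.txt"):
        open(os.path.join(root, name), "w").close()
    # glob.escape() guards against special characters in the temp path.
    pattern = os.path.join(glob.escape(root), "*.dat")
    names = [os.path.basename(p) for p in sorted_glob(pattern)]

print(names)
```

The sort cost is paid only by callers who ask for it, which is the trade-off argued for in the comments above.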
msg220472 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2014-06-13 17:50
I agree that glob shouldn't sort. In addition, iglob definitely can't sort, and I don't think you want to have glob sort but iglob not sort.
msg220537 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-06-14 08:43
Actually iglob() can sort (per directory). But I don't think this feature is much needed. In any case, you can sort the result of glob().
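
As a rough sketch of what "sorting per directory" could mean for a lazy iterator (this is not the glob module's implementation): the hypothetical generator `sorted_iglob` below takes a root directory and a list of per-component patterns, sorts each directory listing as it is visited, and yields matches one at a time. It assumes Python 3 and, unlike glob, does not treat leading dots specially. The resulting order is per directory, so it can differ from sorting the full path strings.

```python
import fnmatch
import os
import tempfile


def sorted_iglob(root, parts):
    """Lazily yield paths under root matching the patterns in parts,
    one pattern per path component, sorting each directory on visit."""
    if not parts:
        yield root
        return
    head, rest = parts[0], parts[1:]
    try:
        names = sorted(os.listdir(root))
    except OSError:
        # Not a directory, or unreadable: nothing to yield from here.
        return
    for name in names:
        if fnmatch.fnmatchcase(name, head):
            yield from sorted_iglob(os.path.join(root, name), rest)


# Demonstrate on a throwaway tree: v1/{a.dat,y.dat,n.txt}, v2/x.dat
with tempfile.TemporaryDirectory() as root:
    for d, f in (("v2", "x.dat"), ("v1", "y.dat"), ("v1", "a.dat"), ("v1", "n.txt")):
        os.makedirs(os.path.join(root, d), exist_ok=True)
        open(os.path.join(root, d, f), "w").close()
    found = [os.path.relpath(p, root) for p in sorted_iglob(root, ["v*", "*.dat"])]

print(found)
```

Only one directory listing is sorted at a time, so the iterator stays lazy across directories even though each individual listing is read in full.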
msg254565 - Author: Dave Jones (Dave Jones) * Date: 2015-11-12 22:22
From the bash man-page: "... If one of these characters appears, then the word is regarded as a pattern, and replaced with an *alphabetically sorted* list of filenames matching the pattern".

I would agree that glob.glob shouldn't sort its results (the overhead may be substantial, and there are plenty of use-cases that don't require sorting), but given that the documented behaviour is at odds (implicitly via the shell's documentation) with the implemented behaviour I would argue that it is premature to close this without at least adding a note to the Python docs.

(P.S. in case my comment is received poorly, I'm not the original author of this ticket, and no aspersions should be cast upon drj for my possibly foolish views!)
msg254566 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-12 22:36
Technically the docs are not wrong: "matches files according to the rules of the shell" does not say anything about sorting (matching is separate from what is done with the matched filenames; the shell sorts them and inserts them in place, python returns an unsorted list).  It also mentions using listdir, which is documented as unsorted.

That said, it would be reasonable to insert a disclaimer that the returned results are unsorted, unlike the shell.  If you want to open a new issue with a proposed doc patch, that would be fine.
msg254567 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2015-11-12 22:41
Assuming David means "it wouldn't be unreasonable to insert a disclaimer", I agree.
msg254569 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-12 22:45
You mean my old English teachers were wrong when they said a positive statement was to be preferred to a double negative? :) :)
msg254571 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2015-11-12 22:55
D'oh! I read your original comment as "it would be unreasonable to insert a disclaimer", and then I wondered why you'd used such a convoluted sentence and reversed your meaning. It's all my fault. Fortunately, I don't think Mrs. McKinley from 11th grade English will come after me after all these years.

Sorry about that. We're in violent agreement.
msg254592 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-11-13 09:30
There is nothing special about glob(). By default the ls command outputs a sorted list of files, but os.listdir() doesn't. Python is just a lower-level language than the POSIX shell. You can always call sort() on the result.

It would be easy to add sort() after the os.listdir() call in the current glob() implementation. It shouldn't significantly affect performance. I would support this feature. But I'm planning to implement glob() with os.scandir(), and it is not so easy to support sorting in that implementation.
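
To make the comparison above concrete, here is a small sketch (Python 3.6 or later assumed; the helper names are hypothetical and this is not the glob module's code) of sorting a directory listing obtained from os.listdir() versus one obtained from os.scandir(). In both cases the whole listing has to be read before the first entry is available, so a scandir-based iterator gives up its incremental behaviour within that directory once sorting is added.

```python
import os
import tempfile


def listdir_sorted(path):
    """Names in path, sorted by code point."""
    return sorted(os.listdir(path))


def scandir_sorted(path):
    """os.DirEntry objects in path, sorted by entry name."""
    with os.scandir(path) as entries:
        # sorted() exhausts the iterator before the directory handle closes.
        return sorted(entries, key=lambda entry: entry.name)


# Demonstrate on a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    for name in ("c.dat", "a.dat", "b.dat"):
        open(os.path.join(root, name), "w").close()
    by_listdir = listdir_sorted(root)
    by_scandir = [entry.name for entry in scandir_sorted(root)]

print(by_listdir, by_scandir)
```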
msg254598 - Author: Dave Jones (Dave Jones) * Date: 2015-11-13 10:27
As suggested, doc patch attached to new issue 25615.
msg254599 - Author: David Jones (drj) * Date: 2015-11-13 11:19
The original bug report did not mention ls (note serhiy.storchaka). It is a red herring.

I accept that the Python community doesn't care to have glob.glob sorted.
But then I think you should distance yourself from the shell in the documentation.

It currently says:
"The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell"

You could say something like:
"The glob module finds all the pathnames matching a specified pattern, using a syntax inspired by the Unix shell; unlike the Unix shell, the ordering is not guaranteed"
History
| Date | User | Action | Args |
|---|---|---|---|
| 2015-11-13 11:19:56 | drj | set | messages: + msg254599 |
| 2015-11-13 10:27:50 | Dave Jones | set | messages: + msg254598 |
| 2015-11-13 09:30:46 | serhiy.storchaka | set | messages: + msg254592 |
| 2015-11-12 22:55:06 | eric.smith | set | messages: + msg254571 |
| 2015-11-12 22:45:14 | r.david.murray | set | messages: + msg254569 |
| 2015-11-12 22:41:59 | eric.smith | set | messages: + msg254567 |
| 2015-11-12 22:36:53 | r.david.murray | set | messages: + msg254566 |
| 2015-11-12 22:22:40 | Dave Jones | set | nosy: + Dave Jones; messages: + msg254565 |
| 2015-11-11 18:43:55 | serhiy.storchaka | set | status: open -> closed; stage: resolved |
| 2014-06-14 08:43:47 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg220537 |
| 2014-06-13 17:50:03 | eric.smith | set | status: pending -> open; nosy: + eric.smith; messages: + msg220472 |
| 2014-06-13 16:59:35 | Jim.Jewett | set | status: open -> pending; nosy: + Jim.Jewett; messages: + msg220468; resolution: not a bug |
| 2014-06-13 14:23:43 | r.david.murray | set | nosy: + r.david.murray; messages: + msg220443 |
| 2014-06-13 13:31:43 | drj | create | |